# **Measuring the Impact of Peacebuilding Interventions on Rule of Law and Security Institutions**

Vincenza Scherrer

DCAF a centre for security, development and the rule of law

SSR PAPER 6

Published by Ubiquity Press Ltd. 6 Osborn Street, Unit 2N London E1 6TD www.ubiquitypress.com

Text © Vincenza Scherrer 2012

First published 2012 Transferred to Ubiquity Press 2018

Cover image © UN Photo, unmultimedia.org

Editors: Alan Bryden & Heiner Hänggi Production: Yury Korobovsky Copy editor: Cherry Ekins

ISBN (PDF): 978-1-911529-33-0 ISSN (online): 2571-9297

DOI: https://doi.org/10.5334/bbq

This work is licensed under the Creative Commons Attribution 4.0 International License (unless stated otherwise within the content of the work). To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ or send a letter to Creative Commons, 444 Castro Street, Suite 900, Mountain View, California, 94041, USA. This license allows for copying any part of the work for personal and commercial use, providing author attribution is clearly stated.

This book was originally published by the Geneva Centre for the Democratic Control of Armed Forces (DCAF), an international foundation whose mission is to assist the international community in pursuing good governance and reform of the security sector. The title transferred to Ubiquity Press when the series moved to an open access platform. The full text of this book was peer reviewed according to the original publisher's policy at the time. The original ISBN for this title was 978‐92‐9222‐204‐8.

*SSR Papers* is a flagship DCAF publication series intended to contribute innovative thinking on important themes and approaches relating to security sector reform (SSR) in the broader context of security sector governance (SSG). Papers provide original and provocative analysis on topics that are directly linked to the challenges of a governance‐driven security sector reform agenda. SSR Papers are intended for researchers, policy‐makers and practitioners involved in this field.

The views expressed are those of the author(s) alone and do not in any way reflect the views of the institutions referred to or represented within this paper.

#### Suggested citation:

Scherrer, V. 2018. *Measuring the Impact of Peacebuilding Interventions on Rule of Law and Security Institutions.* London: Ubiquity Press. DOI: https://doi.org/10.5334/bbq. License: CC-BY 4.0

## **INTRODUCTION**<sup>1</sup>

Peacebuilding aims to lay the foundations for sustainable peace and development by promoting measures that seek to reduce the risk of violent conflict.<sup>2</sup> Since the 1990s, internationally‐supported peacebuilding interventions have become increasingly prominent. The number of international peacebuilding interventions, however, stands in stark contrast to our limited knowledge of their impact on the ground. Better understanding the impact of these interventions is thus a prerequisite for improving peacebuilding practice. In short, this means asking what is working, what is not, and why. At the same time, today's difficult economic climate has only added impetus to the need to understand the impact of peacebuilding interventions. Yet if the need to measure impact appears self‐evident, conceptual advances in how to measure it nevertheless remain limited. This is simply unsustainable given the need to maximise the effectiveness of what are major international commitments.

Activities focusing on rule of law and security institutions are a key component of the peacebuilding agenda. According to the Capstone doctrine for UN peacekeeping operations, 'Security Sector Reform (SSR) and other rule of law‐related activities' constitute 'critical peacebuilding activities'.<sup>3</sup> In simple terms, rule of law and security institutions are responsible for the provision, management and oversight of justice and security in a country. Legal, judicial, prison and police institutions constitute core rule of law institutions.<sup>4</sup> Security institutions can include a wide variety of actors, such as the armed forces, police, corrections, intelligence services and institutions responsible for border management, customs and civil emergencies, as well as those bodies that manage and oversee the delivery of security such as ministries, legislative bodies and civil society groups.<sup>5</sup> Drawing on a background study prepared for the Office of Rule of Law and Security Institutions (OROLSI) at the United Nations Department of Peacekeeping Operations (DPKO),<sup>6</sup> this paper purposefully uses the terminology of OROLSI, which links rule of law and security institutions to areas such as justice, corrections, police, disarmament, demobilization and reintegration, security sector reform and mine action.<sup>7</sup>

While policy frameworks concur that activities should be interrelated, support to rule of law and security institutions tends to be delivered in a siloed fashion. Moreover, programming often takes place on the assumption that activities will lead to positive, long‐term change that will be felt in the lives of beneficiaries. Rarely are these assumptions tested. In fact, evaluations – when they occur – are often heavily focused on 'outputs' that do not shed light on the value of international support. Impact measurement can therefore help to focus on the bigger picture, understanding how efforts complement or contradict one another.<sup>8</sup> Such an approach can offer major benefits for the coherence, sequencing, and coordination of international support.<sup>9</sup> 

Pressure to measure the impact of international support to rule of law and security institutions in peacebuilding interventions will only increase.<sup>10</sup> Demands for a better understanding of impact measurement began in the development community and have now spread to the humanitarian and peacebuilding communities. These demands have sometimes met with resistance but they have also triggered confusion over when and how impact can be measured. Impact has often been perceived as a particularly elusive level of the results chain where the contribution of an intervention cannot be proven.<sup>11</sup> Furthermore, there is a tendency to perceive impact as being visible only several years after an intervention and therefore as too long term to be measured effectively for the purposes of programming and policy. The high costs of effective impact assessment are also put forward as a disincentive. However, if there is scepticism over the feasibility of determining impact, there is also a growing consensus that impact measurement can be 'demystified'.<sup>12</sup> This would result in much needed clarity on a number of levels: 'to know whether the intervention has worked, to learn from it, to increase transparency of the intervention, and to know its "value for money"'.<sup>13</sup>

The objective of this paper is to better understand ways to measure the impact of peacebuilding interventions on rule of law and security institutions. In particular, it seeks to identify impact assessment methodologies that can be applied in complex, multi‐layered post‐conflict interventions. Emphasis is placed on how bilateral and multilateral actors have approached this challenge in practice, with a particular focus on the role of the United Nations.<sup>14</sup>

The measurement of impact poses a number of dilemmas, including weighing the relatively high costs of assessments against the potential benefits and disaggregating the contributions of specific interventions given the wide range of actors concerned. In view of these concerns, this paper highlights methodologies that can be used for measuring impact drawing on both qualitative and quantitative approaches. While it is sometimes assumed that impact can only be measured using scientific‐experimental approaches based on control trials and statistical methods, there are a number of other applicable methodologies. This understanding is an essential starting point in broadening the range of methods and approaches available for impact measurement and thereby increasing the ability of international actors to apply the methodology best suited to the immediate context, skills and resources available. This is particularly important in the area of post‐conflict peacebuilding, where the international community often struggles to meet its basic monitoring and evaluation (M&E) requirements due to capacity and resource gaps.

This study draws on a broad review of current evaluation approaches and methodologies based on extensive desk research. Narrowing the focus to the approaches most relevant for measuring impact, methodologies focussing substantially on measuring performance at the output level were discarded. The evaluation approaches of nineteen international actors (bilateral and multilateral) were reviewed to decipher how they approach the need to measure impact.<sup>15</sup> The research examines primary sources (e.g. evaluations, official documents, reviews) and secondary sources (e.g. handbooks, guidance notes). Meta‐evaluations, which evaluate the evaluations of international actors, were found to be particularly useful because they compare the approaches used in dozens of evaluations providing detailed information on the average timing of evaluation projects, the composition of evaluation teams and the challenges encountered. Finally, the policy and academic literature on evaluation in post‐conflict contexts and impact assessments provided useful background to the analysis.<sup>16</sup> While this study cannot claim to cover all possible approaches and methodologies for measuring impact, it has identified those considered most useful for international actors seeking to measure the impact of peacebuilding support to rule of law and security institutions.<sup>17</sup>

This study is based on definitions found in the OECD DAC Glossary of Key Terms in Evaluation and Results Based Management, a reference commonly used by international actors. 'Impact' is defined as 'positive and negative, primary and secondary long‐term effects produced by a development intervention, directly or indirectly, intended or unintended'.<sup>18</sup> The definition of impact *stricto sensu* as contained in the OECD DAC glossary is that it is 'produced by a development intervention' – thus suggesting that impact can be 'attributed' to a specific intervention. However, when it comes to measuring impact there is a debate about the validity of *attribution* versus *contribution*. Attribution is often promoted as the 'gold standard' because of its ability to demonstrate a direct causal link between an intervention and its impact. However, in complex post‐conflict settings it is considered extremely difficult to isolate the effects of a particular peacebuilding intervention and thus to establish a causal link between the intervention and the observed outcomes and impacts. To avoid this so‐called 'attribution gap', <sup>19</sup> efforts have increased to demonstrate the plausible contribution of an intervention to observed outcomes and impacts. Focusing on contribution as opposed to attribution recognises that there may be other factors that have also contributed to the observed impact. This is particularly relevant in post‐conflict contexts as it takes into account the complexity of 'tracking causality' in the 'non‐linear multi‐agency contexts' within which peacebuilding support takes place.<sup>20</sup>

This paper uses the term 'impact assessment' as opposed to 'impact evaluation' in order to avoid confusion with the scientific‐experimental approach to evaluating impact which is commonly referred to as 'impact evaluation'. The term 'evaluation' is used in general terms while 'assessment' is used when following the word 'impact'. The definition of 'evaluation', based on OECD usage, is 'the systematic and objective assessment of an on‐going or completed project, programme or policy, its design, implementation or results. The aim is to determine the relevance and fulfilment of objectives, development efficiency, effectiveness, impact and sustainability.'<sup>21</sup> Impact is thus only one of five distinct criteria that might be used in evaluations, depending on what the evaluation is seeking to understand.

'Impact assessment methodologies' are therefore understood as those methodologies that seek to evaluate impact (including the negative, positive, intended and unintended effects of an intervention). The study does not examine impact assessments that are conducted *ex ante* to assess the potential future effects of an intervention. Instead the focus is on those approaches that could be adapted to analyse support to rule of law and security institutions, and which could be of relevance to peacebuilding. While monitoring and evaluation (M&E) are closely linked, this paper focuses specifically on evaluations. Monitoring is thus addressed only in so far as it relates to the goal of evaluation.

The paper begins with an overview of the methodological approaches and methodologies that can be used to evaluate impact. The experience of international actors in promoting and measuring impact is then examined to identify how they are currently approaching this issue. The paper then outlines the building blocks of an approach to measuring the impact of peacebuilding interventions in support of rule of law and security institutions in host countries. The paper concludes by identifying key findings and recommendations.

## **IMPACT ASSESSMENT: APPROACHES AND METHODOLOGIES**

This section provides a brief overview of the origins and development of impact assessments, highlighting their evolution from assessment of the potential future consequences of programmes to examining the effects of interventions. The methodological approaches – providing frameworks within which the specific methodologies can be understood – are presented and summarised. The discussion is intended to facilitate understanding of what tools are available depending on the desire of individual actors to use a specific evaluation approach.

## **Origins and Development of Impact Assessments**

Impact assessments were originally developed and used in the public sector to assess the potential future effects – intended and unintended, negative and positive – of public policies (*ex ante* assessments).<sup>22</sup> This practice was then adopted by the development community, with international agencies seeking to mitigate any negative consequences of their activities by assessing the potential future environmental, social, health and economic impacts of their interventions.<sup>23</sup> With the increasing emphasis on aid effectiveness in the 1990s, impact assessments began to be used for post‐intervention evaluation and not just as a pre‐intervention planning tool.<sup>24</sup> Impact has thus become one of the main evaluation criteria in the OECD DAC guidelines (alongside relevance, efficiency, effectiveness and sustainability). There has also been increased interest in the creation of methodologies to assess and measure impact – mainly with a view to assessing the long‐term effects of an intervention.

A similar evolution can be seen in the field of peacebuilding. The term 'impact' was first used by Kenneth Bush's Peace and Conflict Impact Assessment (PCIA) approach in 1998.<sup>25</sup> This approach aims to assess *ex ante* the potential impacts of development projects within the wider context of peace and conflict. Mary Anderson's 'do no harm' approach (1996) is a similar *ex ante* assessment of the potential negative impacts of development and humanitarian interventions, and prepared the ground for the widely accepted principle of conflict sensitivity.<sup>26</sup> Thania Paffenholz and Luc Reychler developed the 'Aid for Peace' approach in 2007<sup>27</sup> – also seen as third‐generation PCIA – which proposes a four‐stage impact assessment (needs, relevance, risks and effects) for development, humanitarian and peacebuilding interventions.<sup>28</sup>

Interest in monitoring and evaluation of peacebuilding interventions has thus grown since the late 1990s and early 2000s.<sup>29</sup> Increasing demands for proven effectiveness and measurable results in the peacebuilding field have raised awareness of the need to measure impact, but at the same time measuring impact in complex post‐conflict settings has been considered especially difficult.<sup>30</sup> There is a school of thought that suggests it is impossible to attribute impacts to a given peacebuilding intervention: in particular, it has been suggested that to measure impact a great deal of sophisticated data collection and analysis over a long period of time is needed, and that 'these requirements either exceed the capacity of many organisations practicing peacebuilding or they extend beyond the donors' funding period'.<sup>31</sup> There have therefore been calls to focus on outcomes rather than impact.<sup>32</sup> Other voices argue that it is possible to assess impact by relying exclusively on scientific‐experimental approaches like those applied in development interventions.<sup>33</sup> But in practice this approach is seldom used in peacebuilding contexts due to its resource‐intensive nature and the fact that it is most usefully applied *ex post*.<sup>34</sup> Yet as in the development field, other approaches to measuring impact that focus on 'contribution' as opposed to 'attribution' are attracting increasing recognition.<sup>35</sup>

One of the most important contributions to impact assessments in peacebuilding evaluations stems from Collaborative Learning Project's Reflecting on Peace Practice (RPP). RPP collaborators challenge the resistance of many peacebuilding practitioners to assessing impacts. In particular, they point out that impact has to be 'demystified' and 'is not always elusive and unreachable, too long‐term or impossible to assess'.<sup>36</sup> There can be short‐term impacts and long‐term impacts. The authors advocate that each peace project be held accountable for its contribution to the broader peace, or 'peace writ large'.<sup>37</sup> The RPP approach provided the basis for new OECD DAC guidance on Evaluation of Conflict Prevention and Peacebuilding Activities.<sup>38</sup> The guidance highlights the challenges of M&E in conflict and post‐conflict settings, and combines the OECD DAC evaluation criteria (relevance, efficiency, impact, effectiveness, sustainability and coherence) with a conflict‐sensitive M&E approach. Most importantly, it takes up the RPP definition of impact, noting that it can also look at short‐term and not just long‐term impact.<sup>39</sup>

In sum, while impact assessments stem from attempts to assess future consequences of activities, they have been adopted as a tool to measure the impact of ongoing or completed projects. Moreover, while the traditional understanding was that evaluations at the impact level had to be undertaken *ex post* (several years after an intervention is in place or completed), there is a growing view that impact can be measured in the more immediate term. This emerging approach offers opportunities for international actors that need to measure impact but cannot wait until the end of an intervention to receive much needed information on what is working, what is not, and why.

## **Evaluation Approaches and their Relation to Impact**

Evaluation approaches refer to the principles guiding the design (and implementation) of the evaluation. In other words, they provide 'the framework, philosophy, or style of an evaluation'.<sup>40</sup> There are a large number of evaluation approaches that may frame impact assessment methodologies. This section does not seek to be exhaustive, but rather provides an eclectic overview of some of the main approaches used in the evaluations of international actors as relevant to the assessment of impact in peacebuilding contexts. It should be noted that evaluation approaches are not necessarily mutually exclusive, and can often be used in a complementary manner.

## **Table 1: Overview of Approaches Examined**


The approaches discussed are separated in Table 1 according to how they relate to measuring impact: whether they can demonstrate attribution, show plausible contribution, or can help to focus the evaluation on impact without necessarily seeking causality.

A range of approaches used by international actors are amenable to measuring impact. The approach claimed to provide the strongest causal link between the intervention and the impact is the *scientific‐experimental approach*.<sup>42</sup> It is often considered as the 'gold standard' in assessing impact due to its ability to show attribution. It is based on counterfactual analysis that enquires what would have happened to the welfare of the beneficiaries if the intervention had not taken place. Based on statistical methods, comparison groups are established to examine the difference between those who received support from an intervention and those who did not (e.g. randomised control trials). This difference is then considered as impact, based on statistically rigorous quantitative measurement techniques. While this approach is considered standard in the development field, its relevance in post‐conflict settings is increasingly questioned.<sup>43</sup> This is because complex settings often include a multitude of actors and a number of influencing factors, which make attribution difficult. Moreover, logistical challenges, such as the difficulty of identifying control groups in multi‐site interventions, and the associated costs are a further concern. Finally, the intention to attribute impact rather than only recognise contribution raises ethical questions in so far as such efforts may hamper national ownership. There are also ethical concerns in justifying the limitation of the intervention to a specified group when the control group has been affected by the same factors that define the group receiving the assistance.<sup>44</sup>

Given the challenges associated with this approach, techniques that seek to demonstrate contribution as opposed to attribution have gained increasing attention. These approaches aim to show the contribution of an intervention but recognise that there are many other factors and actors at play contributing to its positive or negative impact. These include participatory methods and theory‐based approaches.

*Theory‐based approaches* focus on the underpinning assumptions (theories of change) implicit or explicit in a programme design, and aim to test these. These approaches thus seek to identify 'why' and 'how' changes take place in a project, by identifying, articulating and testing the theory of change that links the results chain of a project (from outputs to outcomes and impacts).<sup>45</sup> Such evaluations go beyond conventional results‐based approaches by testing whether an intervention was unsuccessful due to implementation flaws or inconsistent theories.<sup>46</sup> Theory‐based evaluations are useful for both accountability and learning purposes as they provide valuable conclusions on what works, what does not and why.

*Participatory approaches* seek to involve the intervention's primary beneficiaries directly in the planning, monitoring and evaluation process.<sup>47</sup> This may be based on the use of participatory methods such as interviews, focus group discussions and workshops. Advocates of this approach to measuring impact consider local stakeholders to provide 'the single most valuable source of insights' on the impact of an intervention.<sup>48</sup> By listening to the perceptions of the beneficiaries on what initiatives have made changes in their lives, it is the local communities that can provide evidence of cause and effect.<sup>49</sup> Participatory approaches are also recognised as extremely valuable in terms of capacity building, promoting local ownership and enhancing the use and relevance of the evaluation itself. While there may be concerns over the possibility of biased findings due to conflicts of interest among local/national stakeholders, <sup>50</sup> different levels of participation can be supported depending on what methodology is used.

Finally, there are certain other non‐causal approaches to evaluation, which although they cannot demonstrate attribution or contribution may still support evaluations by making them more 'impact‐focused'. These approaches include action evaluation, goal‐free evaluation, results‐based evaluation and utilisation‐focused evaluation.

*Action evaluation* is based on the evaluator facilitating, together with the project team and key stakeholders, the identification of goals and objectives throughout the life of a programme. Progress towards the achievement of the goals/objectives is monitored and later evaluated jointly by the evaluator and project team. Essentially, this approach supports the collective identification of context‐specific criteria of success according to which an intervention is then evaluated.<sup>51</sup> The advantages of this approach are that it promotes joint understanding of success, enables adaptation to changing environments and can be used to track impact. However, this may require a significant change in mind‐set if an action‐oriented approach seeks to allow deviation from its original goals. Also, enabling local stakeholders to define the evaluation criteria may hamper the ability to evaluate an intervention against a mandate set at the international level. The value of this approach lies in its ability to define a joint vision of impact (goals) rather than its role in data collection and analysis.<sup>52</sup>

*Goal‐free evaluation* approaches aim to examine the *actual* outcomes and impacts of interventions in contrast to those intended. The external and independent evaluator/team thus deliberately avoids prior knowledge of the intended goals and objectives of the intervention and has only minimal contact with the project team. The focus is on the actual results of the intervention rather than on the intended results.<sup>53</sup> The aim is to uncover the positive and negative effects of an intervention – whether intended or not – by limiting the bias of the evaluator. The evaluator collects information about programme results and then compares these with the actual needs of the beneficiaries in order to determine effectiveness.<sup>54</sup> The data collection process of a goal‐free evaluation can be time‐intensive and more expensive than conventional results‐based evaluations, since the evaluator/team has to investigate a very broad range of issues, often using participatory methods involving the primary beneficiaries and other key stakeholders.<sup>55</sup> A great advantage of this approach is that it can demonstrate impact without needing a coherent logical framework that sets out the relationships between inputs, outputs, outcomes and impact. It is therefore of particular use where the programme logic is weak, as may often be the case in fast‐paced conflict environments.<sup>56</sup>

*Results‐based evaluation* is the approach most widely adopted by the international community. This approach seeks to identify to what extent the intended broad goals and specific objectives of an intervention have been met according to defined indicators, benchmarks and baseline studies. It can focus on impact only to the extent that indicators at this level are developed. While this approach cannot demonstrate the contribution of an intervention to an impact, it can highlight the linear relationships between outputs, outcomes and impact. Results‐based approaches do not seek to evaluate the utility of the objectives or goals set for the intervention.<sup>57</sup>

*Utilisation‐focused evaluation* is based on the understanding that evaluations 'should be judged by their utility and actual use'.<sup>58</sup> This involves a process whereby the evaluator helps users select the most appropriate approach and methods for the purpose of their evaluation. <sup>59</sup> These evaluations can thus use any type of design and methods, and can focus on any level of the results chain.<sup>60</sup> Such evaluation is therefore not specifically associated with impact, but could be adapted to measuring impact if a project team were to jointly decide on the most appropriate methods for the context at hand. The underpinning idea is that the utilisation‐focused evaluation approach deepens the feeling of ownership of the project team in the evaluation process through their active involvement, and increases the utility of the evaluation – i.e. it becomes more likely that the team will actually use the evaluation results and these will effect change in the project.<sup>61</sup> This could be of importance in the area of measuring impact where there may be reluctance to adapt to new evaluation approaches.

This approach has often been advocated by UN entities, possibly when faced with resistance to evaluations that are perceived as externally imposed.

## **Methodologies for Measuring Impact**

These broad approaches to evaluation are applied in practice through a number of methodologies. Evaluation methodology refers to 'the term covering the different methods to be applied to meet the overall purpose and objectives of the evaluation. The particular methodology to be used for data collection and analysis is determined by the subject and purpose of the evaluation.'<sup>62</sup> Some methodologies are more applicable as a learning tool because they seek to answer questions related to why the intervention had an impact or not while others may be more relevant in order to provide for accountability. The appropriate evaluation methodology should therefore be determined in view of its ability to answer the questions that the evaluation is seeking to answer.

Six methodologies seem applicable to measuring impact with particular relevance for rule of law and security institutions: a) impact evaluation; b) theory‐based impact evaluation; c) contribution analysis; d) outcome mapping; e) RAPID outcome assessment; and f) most significant change (MSC). All of these methodologies can be used to illustrate impact. However, they differ in terms of the evaluation questions they can most usefully answer, and in terms of the time, cost and skill‐sets required to use them. Moreover, some of these can be used independently while others must be combined with other methodologies in order to provide a fuller picture for evaluation purposes. The following table provides an overview:

## **Table 2: Overview of Methodologies for Measuring Impact**


## *Impact evaluation*

Impact evaluations aim to measure and establish the value of impacts that can be *attributed* to an intervention. The methodology is based on the definition of cause‐effect hypotheses to be tested and the use of quantitative methods for data collection and analysis.<sup>63</sup> This is done through a 'counterfactual' analysis of the impacts of an intervention, i.e. a comparison of what actually happened with what would have happened in the absence of the intervention. Impact evaluations employ experimental and quasi‐experimental quantitative methods. They are useful for answering evaluation questions relating to 'whether development interventions do or do not work, whether they make a difference, and how cost‐effective they are.'<sup>64</sup>
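The counterfactual comparison described above can be illustrated with a minimal sketch. It assumes a simple randomised design in which the estimated impact is the difference between the mean outcome of the group that received support and that of a comparison group, with a permutation test used to gauge whether the difference could plausibly arise by chance. The survey scores, function names and choice of test are hypothetical illustrations and are not drawn from the evaluations reviewed in this paper.

```python
import random
from statistics import mean


def difference_in_means(treated, comparison):
    """Estimated impact: mean outcome of the supported group minus
    the mean outcome of the comparison group."""
    return mean(treated) - mean(comparison)


def permutation_p_value(treated, comparison, n_permutations=5000, seed=0):
    """Share of random relabellings of the pooled data whose absolute
    difference in means is at least as large as the observed one."""
    rng = random.Random(seed)
    observed = abs(difference_in_means(treated, comparison))
    pooled = list(treated) + list(comparison)
    n_treated = len(treated)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_treated]) - mean(pooled[n_treated:]))
        if diff >= observed:
            extreme += 1
    return extreme / n_permutations


# Hypothetical perception-of-safety scores (0-10 scale), invented for illustration.
treated_scores = [6.1, 7.0, 5.8, 6.6, 7.2, 6.4]      # communities that received support
comparison_scores = [5.2, 5.9, 5.0, 6.1, 5.5, 5.7]   # comparison communities

estimated_impact = difference_in_means(treated_scores, comparison_scores)
p_value = permutation_p_value(treated_scores, comparison_scores)
print(f"Estimated impact: {estimated_impact:.2f}")
print(f"Permutation p-value: {p_value:.3f}")
```

A sketch of this kind only quantifies a difference between groups. It says nothing about why the difference arose, and it presupposes that a valid comparison group can be identified, which is often the central difficulty in post‐conflict settings.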

Impact evaluations are usually *ex post*, meaning that they are conducted some time after the end of an intervention. There are two main components:


Applying the Impact Evaluation methodology generally involves three key phases:<sup>67</sup>


An impact evaluation is generally very time intensive and the cost can be 'significant'.<sup>68</sup> The actual collection of data and observation of impacts can take years, depending on the intervention and the effects to be observed.<sup>69</sup> The statistical data analysis may take several months. Finally, impact evaluations require very sophisticated scientific skill‐sets. A drawback of impact evaluation is that it cannot highlight why events evolved the way they did because it focuses more on quantifying impact. In order to adapt to the growing demand to understand how impacts emerge, theory‐based impact evaluation has been developed as a method that combines theory of change methods with quantitative techniques (see below).<sup>70</sup>

## *Theory‐based impact evaluation*

Theory‐based impact evaluation (TBIE) takes traditional impact evaluation methods a step further by combining them with theory of change methods to generate better understanding of the underlying reasons for success (or failure). Applying the TBIE approach increases the policy relevance of scientific‐experimental impact evaluations<sup>71</sup> by going beyond the determination of an impact to interpret and evaluate the findings.<sup>72</sup> Such evaluations aim not only to establish what impact a program has had but also to "understand *why* a program has, or has not, had an impact".<sup>73</sup>

Mixed methods (qualitative and quantitative) for data collection and analysis are used.<sup>74</sup> Building on the quantitative counterfactual‐based techniques described above, complementary qualitative methods may include, for example, document analysis (such as analysing project documents to understand the project logic, or academic and political literature to understand the context and inform the evaluation design), focus group discussions, and various types of interviews with project staff, beneficiaries and key stakeholders.

The application of the TBIE approach is based on six steps:<sup>75</sup>


designing the evaluation. It should support an awareness of the social, political and economic factors that may affect the context of the program or its evaluation.


While this approach is widely accepted in theory,<sup>76</sup> its actual use in impact evaluations has been limited. This may be due to the significant costs in time and resources associated with implementing the methodology, given the need to combine both rigorous scientific methods and detailed analysis of theory of change. The advantage of the methodology is that if scientific methods are warranted, this version enables using the evaluation as a learning tool and not just as a means of accountability. The use of impact evaluation is also limited by the fact that sufficient time must pass before statistically relevant information can be generated.

## *Contribution analysis*

Contribution analysis aims to provide plausible evidence of the difference a programme is making (contributing) to observed outcomes. It does so by assessing the underlying theory of change and disaggregating analysis at each level of the results chain. A performance story is then built about the contribution the intervention has made, based on analysis of the programme logic and the results achieved, and after considering alternative causal explanations for the same results (other influencing factors). Contribution analysis differs from traditional results‐based tools in that, in addition to assessing evidence linking a programme to results, it also assesses the assumptions of the theory of change and the influence of external factors and actors. A performance story is developed to explain why it is reasonable to assume that the actions of the programme have contributed to the observed outcomes. Additional evidence is then sought to support the claims. A contribution analysis is usually conducted by an independent, often external, evaluation expert or team. However, most of the aforementioned steps can also be conducted in a participatory way with some involvement of the key stakeholders and beneficiaries.<sup>77</sup> For example, the performance story can be submitted to relevant stakeholders to test their agreement with it and help identify where the weaknesses lie. A prerequisite for contribution analysis is that the programme or project has been planned and implemented within the logical framework approach, based on a specific theory of change.

The methodology employs a mixed‐methods approach for data collection and analysis. In addition to quantitative and formal data sets, it is suggested that the evaluator collect qualitative data, including for example case studies based on documentary analysis, various types of participant interviews, or anecdotal and informal stories (as used in the most significant change technique).<sup>78</sup>

Contribution analysis involves six iterative steps:<sup>79</sup>


This methodology is easy to use because it builds on the logical framework approach. It requires additional information in the logical framework from the outset to support the subsequent testing of a clear theory of change for each level, the assumptions of the theory of change and potential influencing factors.<sup>80</sup>

## *Outcome mapping*

The originality of outcome mapping is that instead of aiming to measure impact, it focuses on outcomes in terms of behavioural change of individuals, groups and institutions and their relationships.<sup>81</sup> Thus outcomes are defined as 'changes in the behaviour, relationships, activities, or actions of the people, groups, and organizations with whom a program works directly'.<sup>82</sup> Outcome mapping does not seek to measure impact in terms of 'tangible products', but is based on the understanding that changes in the behaviour of key stakeholders and partners will ultimately contribute to impact.<sup>83</sup>

Context‐specific progress indicators (so‐called *progress markers*) are used to document the progress of the intervention towards the desired outcomes. They can be seen as 'a graduated ladder of specific changes in boundary partner behaviour and relationships which define and describe progress towards each outcome challenge.'<sup>84</sup> The value of these markers is that they can capture types of change that are hard to measure with more 'traditional' indicators that need to meet criteria referred to as SMART: specific, measurable, achievable, relevant and time‐bound. The data collected for monitoring and evaluation purposes are mostly qualitative and anecdotal in nature. To trace changes and progress, outcome mapping suggests the use of journals (e.g. to monitor relationships) together with other data collection methods such as different types of interviews including group techniques (interviews, focus group discussions, workshops) and document reviews.<sup>85</sup>

The outcome mapping process contains three stages:<sup>86</sup>


In theory, outcome mapping relies on data generated by the programme team and boundary partners, which means that the team must have the time to dedicate to such activities, and the boundary partners (national stakeholders directly linked to the programme) must also be willing to engage actively with the evaluation. In practice, however, rather than requiring boundary partners to collect the information themselves, the approach can be adapted for use with exclusively external assessments, for example by using questionnaires that examine progress markers.<sup>87</sup>

The OECD DAC handbook on security sector reform lists outcome mapping as a potential method for measuring impact in cases where 'changes in attitudes and behaviours (e.g. among senior officials working in security and justice institutions) [are] just as important as any specific practical changes that occur'.<sup>88</sup> While outcome mapping alone is unlikely to be sufficient for conducting evaluations at the impact level, elements of this methodology can be used specifically to analyse behavioural change and then combined with other tools as part of a broader evaluation approach: for example, progress markers could be developed to supplement regular indicators. Outcome mapping can also be combined easily with the logical framework approach.

## *RAPID outcome assessment*

RAPID outcome assessment (ROA) aims to 'assess and map the contribution of a project's actions on a particular change in policy or the policy environment'.<sup>89</sup> It is a systematic approach to collecting data in order to analyse how changes in the behaviour of key stakeholders contribute to policy changes. With the aim of learning, ROA is a flexible and adaptable methodology that can be used in combination with other M&E approaches.<sup>90</sup>

ROA draws on the outcome mapping methodology, the MSC technique and episode studies. In doing so, ROA enables triangulation of data which can therefore increase the reliability of findings.<sup>91</sup> MSC provides tools to identify and rank key changes, while episode studies provide ROA with a toolset to work backwards from an observed policy change (result) to find the most relevant factors that contributed to it.<sup>92</sup> The information gathered for the assessment is of a qualitative nature. Generally it involves background research, a workshop to identify change processes and the triangulation of data. Methods for data collection and analysis may include (the list is not exhaustive): participatory workshops, key informant interviews, narrative and anecdotal stories on contributions to key policy changes, and document analysis.

The ROA methodology has three main stages:<sup>93</sup>


RAPID outcome assessment can be carried out internally by the project team or by an independent external expert. It is a participatory methodology that involves all key stakeholders and project partners in all stages of the process.

## *Most significant change (MSC)*

The MSC technique is a participatory monitoring and evaluation methodology, originally developed to illustrate change processes in community‐based development projects. <sup>94</sup> The methodology aims at collecting so‐called 'significant change stories', and systematically selecting the one considered to have had the most significant impact on people's lives. The MSC approach uses participants' stories of change to spot the outcomes and impacts (intended and unintended, positive and negative) of an intervention and classify them in a hierarchical way through ongoing dialogues between the project team and the beneficiaries. It is important to highlight that MSC does not aim to measure impact with predetermined indicators, but rather to illustrate change.<sup>95</sup> The data collection, monitoring and analysis are done by those most directly involved – participants and field staff.

The main methods for data collection and analysis may include common tools of participatory M&E approaches, such as: focus group discussions, different types of interviews and workshops. The implementation of the MSC technique generally involves the following steps, although they can be adapted to fit the needs of a specific project:<sup>96</sup>


MSC can be useful in challenging contexts where a 'real‐time' impact assessment is needed to monitor the immediate effects the intervention is having on the lives of beneficiaries. <sup>97</sup> It is also considered useful in situations where there is more emphasis on significant change being achieved than on specific original objectives being met.<sup>98</sup> However, it requires 'an organizational culture where it is acceptable to discuss things that go wrong as well as success, and a willingness to try something different.'<sup>99</sup> If applied as intended, that is, as an iterative and participatory method, it is a time‐consuming approach. But it can also be simplified and adapted for use by external evaluators to supplement other evaluation methods by including questions on 'most significant change' in questionnaires or focus group meetings.

## **Summary**

This section has demonstrated that there is a wide range of methodological approaches that can be used for measuring impact, ranging from scientific‐experimental designs to participatory or utilisation‐focused designs. There is also a range of impact assessment methodologies that can be used to measure impact. Table 3 below shows how the impact assessment methodologies fit within the approaches discussed. This is intended to highlight which methodologies correspond to approaches that international actors would like to incorporate in how they measure impact.

The Goal‐free evaluation method is the only approach where no corresponding methodology for measuring impact was found within the limits of this study, and it is therefore not reflected in the table. However, it can be considered both an approach and a methodology since the principle guiding the design of the evaluation is that there is no knowledge of the originally intended goals. This is also how the evaluation is intended to be undertaken in practice – the methods used as part of the methodology are therefore based on the need to collect information from the beneficiaries while avoiding prior knowledge of the goals. The utilisation‐focused approach can be applicable to any methodology, as it depends on the purpose of the evaluation and the decision of the project team.

*Impact evaluation* is a scientific‐experimental approach to evaluation. Since impact evaluations are usually conducted *ex post*, several years after the completion of a project, they are generally not conducted in a participatory manner.

*Theory‐based impact evaluation* combines scientific‐experimental designs with a theory of change approach. It also draws on qualitative data, provided to a certain extent by participatory methods, to enable the analysis of the underlying project assumptions and theories of change.

*Contribution analysis* is a participatory approach to the extent that the external independent evaluation team gathers the perceptions of key stakeholders and beneficiaries to test the theory of change; but it does not belong to the type of participatory approach that enables national capacity building through evaluation. Its participatory component can be strengthened by submitting a performance story to relevant stakeholders to test their agreement with it and potentially help identify where weaknesses lie. It is also a theory‐based approach in so far as the concept of theory of change is a crucial component. It builds on the results‐based approach, but it is not a results‐based approach *per se* as it requires further information on the underlying assumptions of the theory of change and assessment of the influence of external factors.

*Outcome mapping* is a participatory approach that involves all key stakeholders and to a certain extent the beneficiaries in the planning, monitoring and evaluation process. Similar to MSC methods, outcome mapping processes are implemented within an action‐orientated approach: the project team and key stakeholders are assisted by an external facilitator in setting objectives and goals for behaviour change in the planning phase of the project, and decide on ways to measure and evaluate the jointly agreed goals. Also, MSC and outcome mapping feature another characteristic of action‐orientated approaches, namely the possibility to adapt the project and the corresponding monitoring and evaluation system to changing circumstances over the course of implementation. Finally outcome mapping can be considered as a theory‐based approach, since it aims to understand the change processes a project has contributed to.

*RAPID outcome assessment* is a combination of both MSC and outcome mapping and is therefore also considered to be a participatory as well as a theory‐based and an action‐orientated approach.

*Most significant change* is a genuinely participatory approach in the sense that it relies on the active engagement of stakeholders in the evaluation as well as in the planning and monitoring phase. It can also be considered as an action‐oriented approach to the extent that stakeholders (especially the project team) come together to define their criteria of success by contributing stories of what they have considered to be the most important elements of impact.

This section has highlighted that each methodological approach and impact assessment methodology can serve different purposes. Some approaches are more suited to supporting the participation of national actors (e.g. participatory and action‐orientated evaluation approaches) while others are more fitted for testing the theory of change (e.g. theory‐based approaches). Similarly, some methodologies are better suited for measuring behaviour change (e.g. outcome mapping) while others are better geared to answer questions related to the extent of impact achieved (e.g. impact evaluation). They all have different strengths and weaknesses. Moreover, they are not mutually exclusive. Most of them can be combined and used in a complementary manner. Most can also build on traditional results‐based management approaches used by many international actors. Building on this discussion the next section turns to examine which of these approaches and methodologies international actors are using in practice to measure their impact.

## **MEASURING IMPACT: APPROACHES OF INTERNATIONAL ACTORS**

This section examines the current approaches taken by international actors to evaluations and how these relate to measuring impact.<sup>101</sup> It looks at the challenges actors face in addressing the need to measure impact and how they seek to overcome them. The discussion considers both bilateral and multilateral actors, with emphasis on UN actors. The actors covered by the analysis are those currently most proactive in the development of approaches and methods for evaluation. These actors have also made substantial information on their approaches to measuring impact publicly available.

The discussion considers the following bilateral actors:


Multilateral actors discussed include:



UN actors are examined separately because it is important to highlight the differences in approaches to measuring impact within the UN system itself. The UN entities examined are members of either the UN Rule of Law Coordination and Resource Group or the UN Inter‐Agency SSR Task Force.<sup>102</sup>

UN entities examined include:


The evaluation approaches of international actors examined are mostly generic – that is to say, they are broad approaches outlined in policies and guidelines that are applicable to a number of activities the actors are engaged in, including but not limited to peacebuilding and support to rule of law and security institutions. The evaluations and meta‐evaluations examined in this section can be grouped into two main categories: cross‐cutting generic (e.g. country programmes) and sectoral (e.g. health, gender, development). While the information from this section is therefore not specifically linked to rule of law and security institutions issues (due to the scarcity of available impact assessments focusing on these issues), several insights can be drawn that are nonetheless relevant to measuring the impact of peacebuilding support for rule of law and security institutions.

## **Bilateral Actors**

All the bilateral actors examined have recognised the importance of measuring impact in their internal policies and/or guidelines. In some cases this is for reasons of accountability: for example, AusAID is required to demonstrate impact in informing the government's budget process.<sup>103</sup> In other cases it is due to the recognised learning potential: for example, according to JICA's annual evaluation report, impact evaluation is considered to be important because simpler methods that compare outcomes before and after project implementation have a tendency either to over‐ or underestimate change.<sup>104</sup>

Many of the bilateral actors' policies and guidelines consider approaches that enable attribution as being the most 'rigorous'. For example, USAID considers the best impact evaluations to be those 'in which comparisons are made between beneficiaries that are randomly assigned to either a treatment or a control group'.<sup>105</sup> USAID requires all projects using untested hypotheses or new approaches to undergo impact evaluations where possible; if this is not possible, a detailed statement explaining why must be included in the final report and a performance evaluation can be done instead.<sup>106</sup> JICA also promotes methods that enable attribution, specifically undertaking a trial of four different techniques within impact evaluations.<sup>107</sup> These were all tested in an attempt to find alternatives to randomised control trials, which were noted as being especially difficult to conduct in certain fields.<sup>108</sup>

The extent to which advocates of experimental approaches actually use them in practice is questionable. Evaluation reviews often point to measuring impact as a weak component of evaluation approaches. For example, while USAID advocates experimental methods to evaluate impact, a sample of evaluations examined revealed that only a few took a statistical/quasi‐experimental approach; the majority did not. Moreover, a meta‐evaluation of its recent evaluations noted that only 26 per cent of evaluations used the logic model in order to determine the causal relationship between inputs, outcomes and impacts.<sup>109</sup> A similar finding is reflected in a Sida assessment of 34 evaluation reports in 2008, which found that none used experimental or quasi‐experimental methods.<sup>110</sup> Most of the data came from document reviews and open‐ended interviews.<sup>111</sup> The Sida study highlighted that fewer than half the reports provided a satisfactory analysis of impact.<sup>112</sup>

The Independent Commission for Aid Impact (ICAI), which is an important player in evaluating UK aid impact, notes that in general statistical methods such as randomised control trials are not to be used due to the time they take to implement and the fact that they would need to be built into the project design.<sup>113</sup> However, where there are concerns about the results of a project, quantitative analysis may be used, including statistical techniques that 'construct a suitable "counterfactual" in place of a control group, so as to test for attribution more rigorously'.<sup>114</sup> In practice, several reviews of DFID evaluations concluded that M&E were 'not sufficiently focused on impact' and 'not able to attribute results to DFID support'.<sup>115</sup> Challenges in establishing counterfactuals have been noted, particularly concerning the need to identify a counterfactual from the outset of the intervention as opposed to searching for one retrospectively.<sup>116</sup>

The majority of actors therefore promote experimental and quasi‐experimental approaches but also note that these may be too challenging for their needs. Challenges raised relate to logistical obstacles: for example, proving attribution has been raised by CIDA as a challenge due to the difficulty in finding a robust methodology that is feasible to use.<sup>117</sup> Sida's Evaluation Manual acknowledges that a plausible counterfactual may be difficult to establish, and in those situations weaker arguments based on expert knowledge would be acceptable.<sup>118</sup> Similarly, an AusAID evaluation pointed to the difficulty of relying on impact evaluation due to the fact that a lack of baseline and parallel activities rendered attribution difficult.<sup>119</sup> Cost‐related disincentives have also been raised: for instance, AusAID's best practice brief recognises that the greatest cost‐benefit in the area of impact assessment can be achieved by limiting itself to a 'handful' of experimental or quasi‐experimental impact evaluations.<sup>120</sup>

There is a general recognition among bilateral actors that in practice it may be more realistic to rely on 'best extent possible' approaches that seek to show contribution by other methods.<sup>121</sup> Such methodologies increasingly being used by international actors include outcome mapping, contribution analysis and MSC, while few examples were found of ROA. Concerning the use of qualitative methods more generally, AusAID in particular recognises their value specifically in addressing the reality that AusAID programmes are often delivered over short timeframes. To meet this challenge, alternative approaches were sought that would show progress towards results as opposed to establishing rigorous causal links. On this basis, for example, contribution analysis was selected as a methodology to evaluate the impact of an education programme in Fiji.<sup>145</sup> Australia is among the first bilateral donors to use contribution analysis in its evaluations.<sup>146</sup> Among other things, it was found to have supported donor harmonisation by focussing on developing alternative explanations for progress and thereby allowing a greater appreciation of other donor activities and supporting synergies.<sup>147</sup> Moreover, contribution analysis was also considered to generate significant improvements in programme logic.<sup>148</sup> Sida has also used outcome mapping in an evaluation of civil society projects in Bosnia‐Herzegovina, and recognised its benefits over the logical framework approach.<sup>149</sup>

## **Multilateral Actors**

The multilateral actors examined have all recognised the importance of measuring impact. The AfDB, the OECD's Development Assistance Committee (OECD DAC) and the World Bank are all members of NONIE, a 'Network of Networks for Impact Evaluation comprised of the OECD DAC Evaluation Network, the United Nations Evaluation Group (UNEG), the Evaluation Cooperation Group (ECG), and the International Organization for Cooperation in Evaluation (IOCE) – a network drawn from the regional evaluation associations'.<sup>150</sup> A key element is the focus on experimental design as the main approach to measuring impact. NONIE is one of the lead actors promoting the 'impact evaluation' methodology outlined in the previous section, and is a proponent of attribution analysis based on the counterfactual. It does note that in certain cases contribution analysis may be the most realistic approach, but it only addresses alternative methods in its annex. In its annex, there is one page on qualitative methods for assessing the effects of interventions including outcome mapping and MSC among others.<sup>151</sup> Outcome mapping and MSC are methodologies that, as discussed above, are relevant to international actors engaged in support to rule of law and security institutions. The following provides examples of the approaches relevant multilateral actors have taken in practice:




The AfDB promotes self‐evaluations that assess attribution of socio‐economic and policy changes.<sup>169</sup> The guidelines suggest that modified 'before‐and‐after' methods should be used to deduce attribution based on both quantitative and qualitative data.<sup>170</sup> In practice, it has considered conducting impact evaluations to be challenging at times due to the need for pre‐evaluation studies and the lack of available baseline data.<sup>171</sup> The MSC approach has been used for the evaluation of an AfDB decentralisation strategy to identify 'difficult to quantify changes' that are not easily captured by traditional M&E. In this context stakeholder stories were developed to support the testing of the theory of change.<sup>172</sup>

EuropeAid's evaluation guidelines recognise the need to focus on attribution and contribution. The guidelines put forward four strategies: change analysis, meta‐analysis, attribution (counterfactual) analysis and contribution analysis.<sup>173</sup> While the choice of analysis is left to those conducting the evaluation, the guidelines mention the usefulness of the latter three strategies when answering cause‐and‐effect questions such as measuring the EC's attribution or contribution to an effect, the sustainability of that contribution and whether it is being achieved at a reasonable cost.<sup>174</sup> EuropeAid has used both MSC and contribution analysis in its evaluations.

The OECD DAC has a subsidiary body called the Network on Development Evaluation. The network's overall goal is to 'increase the effectiveness of development‐cooperation policies and programmes by promoting high‐quality, independent evaluation'.<sup>175</sup> The OECD DAC definition of impact developed by the network is the most widely used and accepted in the field of international development cooperation, as well as in peacebuilding.<sup>176</sup> In the area of development the network promotes rigorous impact evaluation, as a member of NONIE.<sup>177</sup> In the area of peacebuilding its Guidance on Evaluating Conflict Prevention and Peacebuilding Activities proposes a further range of different approaches, which can be adapted in peacebuilding settings.<sup>178</sup> Mixed‐method approaches are promoted to deal with the complexity of peacebuilding interventions.<sup>179</sup> The guidelines also recognise that in peacebuilding contexts impacts can be longer term or relatively immediate.<sup>180</sup>

The World Bank's independent evaluation group (IEG) works towards quality improvement of impact evaluation.<sup>181</sup> IEG conducts *ex post* impact evaluations several years after the end of a project, using scientific‐experimental and quasi‐experimental designs.<sup>182</sup> IEG considers that 'good evaluations are almost invariably mixed method evaluations', and therefore uses qualitative data to inform the design and interpretation of quantitative impact evaluations.<sup>183</sup> Both experimental and non‐experimental designs have been used in practice to create counterfactual groups (see Table 4).

## **UN Actors**

Evaluation within the UN system takes place at different levels and follows a diverse set of arrangements. There are some UN entities that have specific evaluation roles, such as the Joint Inspection Unit (JIU), the Office of Internal Oversight Services (OIOS) and the UN Evaluation Group. While the JIU has no mandate for measuring impact, the OIOS can measure the impact of Secretariat programmes.<sup>184</sup> For OIOS, impact refers to 'the ultimate, highest level, or end outcome that is desired.'<sup>185</sup> It advocates three types of evaluation design, namely experimental (using counterfactuals), quasi‐experimental and non‐experimental.<sup>186</sup> According to its Inspection and Evaluation Manual, '[n]on‐experimental designs are the most commonly used evaluation design' by its evaluation division.<sup>187</sup> In general, it has been recognised that measuring impact is a major weakness in evaluations across the UN system, which can account for the increasing efforts to focus on this issue.<sup>188</sup> The need to measure impact has increasingly been addressed in UN reports. A 2009 report of the Committee for Programme and Coordination recommended that the General Assembly request the Secretary‐General to ensure that relevance and impact are the major focus of self‐evaluations.<sup>189</sup> Moreover, it was recommended that the various secretariats are made aware of 'the imperative of impact assessment'.<sup>190</sup>

Table 6 outlines what methods UN entities are increasingly promoting for the measurement of impact, and how they have addressed this issue in practice.


**Table 6: Examples of UN Entities' Approaches to Measuring Impact** 


Several UN entities advocate impact measurement in their policies and various guidelines: for example, according to a UNDP handbook, impact evaluations are useful when the 'project or programme is functioning long enough to have visible effects' and 'has a scale that justifies a more thorough evaluation'.<sup>212</sup> For UN WOMEN, demonstrating evidence of impact is considered to be the primary goal of M&E activities at the thematic level, especially regarding progress towards the goals of the Fund for Gender Equality.<sup>213</sup> According to UNHCR's evaluation policy, the 'primary concern of all evaluations is the impact of UNHCR's work on the rights and welfare of beneficiaries, even when the evaluation relates to entities and functions that do not have a direct impact on people of concern to the organization'.<sup>214</sup>

Approaches to measuring impact vary significantly across the entities examined. Methods that support attribution have at times been promoted but are rarely used by the UN in practice. UNICEF's equity‐focused evaluation resource centre presents a list of seven basic impact evaluation designs that include the use of randomized control trials or quasi‐experimental designs, the latter using comparison groups that may or may not be statistically matched.<sup>215</sup> Where experimental designs are not feasible, non‐experimental designs are also advocated.<sup>216</sup> In practice, however, the majority of UNICEF evaluations examined do not use rigorous scientific‐experimental approaches.<sup>217</sup>

A common theme among UN actors is the fact that practical evaluations at the impact level are less abundant than other types of evaluations even while the need to measure impact is often recognised in policy guidance: for example, although little public information is available regarding DPKO approaches to impact assessment, a recent review of DPKO arrangements for M&E of SSR suggests that mechanisms for measuring impact are generally lacking.<sup>218</sup> This is considered to be due to a focus on measuring internal performance instead of external impact on the delivery of security and access to justice.<sup>219</sup> In addition to lacking the capacities or mandates for measuring impact, the actual challenges of completing evaluations have also been raised. For instance, a UNIFEM meta‐evaluation noted that evaluations often maintain 'impact was not possible to assess due to insufficient passage of time, lack of baselines and/or problems inferring causality'.<sup>220</sup>

Other evaluations refer to impact, but the methodological basis for doing so is questionable: for example, the term 'impact' might be used, but not in accordance with the OECD DAC definition. It is therefore noted that 'some terms of reference and reports use the term "impact" loosely or incorrectly'.<sup>221</sup> Similarly, a meta‐evaluation conducted in 2005 of over 60 UNFPA evaluations noted that they tended to focus on the output level and therefore did not cover the criteria of impact and sustainability well.<sup>222</sup> This is echoed in a 2008 UNODC report that noted that there has been 'too much emphasis on process rather than impact'.<sup>223</sup> A global UNIFEM meta‐evaluation also notes that neglecting impact is a key weakness in UNIFEM's approach to evaluations. It notes that the evaluation of impact should be more consistently applied (alongside the other OECD DAC criteria) when conditions are favourable, and that the assessment of impact would 'require the use of participatory and innovative evaluation approaches – which in turn would typically require additional resources for evaluations'.<sup>224</sup> The only DPKO impact assessment found publicly available was commissioned by its gender unit to examine the impact of the implementation of UN Security Council Resolution 1325 in peacekeeping. It used a participatory approach based on the perceptions of contribution identified by government and civil society respondents, as well as comparison with pre‐mission conditions.<sup>225</sup>

In the humanitarian world there has been increasing interest in the idea of adopting a joint impact assessment approach for interventions. To this end an Inter‐Agency Consultation Group was created to define such an approach.<sup>226</sup> This was based on consultations with a wide group of actors to identify what the best approaches for measuring impact would be. The consultations were based on interviews with affected communities in 'Sudan, Bangladesh and Haiti, and local government and local NGOs in the same countries; with national government and international humanitarian actors in Haiti and Bangladesh; and with 67 international humanitarian actors, donors and evaluators in New York, Rome, Geneva, London and Washington'.<sup>227</sup> While this group was specifically looking at humanitarian interventions, it is of relevance given the inclusive consultation process it used as a basis for its development of a joint approach to evaluation.

A key element that came out of the process was the lack of support for quasi‐experimental design because of both ethical and practical challenges.<sup>228</sup> While it was noted that quasi‐experimental design is the preferable approach when seeking to determine attribution, it was recognised that 'the trade‐offs between appropriateness and rigour need to be taken into consideration. Quasi‐experimental design, usually involving large‐scale surveys where enumerators have limited interaction with affected people but rather fill out pre‐set questionnaires, is not conducive to participation.'<sup>229</sup> In terms of the approaches that were identified as most useful for measuring impact in the field, these were mainly participatory. Goal‐free evaluation was also raised as being an important approach due to its participatory nature and ability to capture unintended and indirect results.<sup>230</sup>

## **Summary**

While the information from this section is not specifically linked to rule of law and security institutions issues (due to the challenge of finding impact assessments focusing on these issues), there are several insights that can be drawn that are nonetheless relevant to this sector.

First, it is evident that all the actors examined are struggling with the same need to measure impact. For the bilateral actors this is sometimes a result of budgetary and oversight requirements set out by their respective governments. For UN entities it is often related to the increased recognition of the importance of measuring impact from a policy perspective as well as pressure from member states to justify the use of resources.<sup>231</sup> For multilateral actors, the steps towards measuring impact can be inscribed within the efforts of the international networks they belong to that promote impact evaluation. While in most cases attribution is promoted due to the perception that it is a more rigorous approach, in practice demonstrating contribution is often seen as more feasible. However, even examples of impact measurement through contribution are limited.<sup>232</sup> In fact, it is often noted that more focus is needed on innovative approaches for the measurement of impact through contribution. For example, a UNIFEM meta‐evaluation has promoted outcome mapping and MSC approaches as options to consider; in addition to contribution analysis, multilateral and bilateral actors are also increasingly using these two techniques. Participatory approaches in particular have gained increasing recognition as a way to enhance impact measurement.<sup>233</sup> However, together with utilisation‐focused approaches, participatory approaches are also the ones most commonly singled out in UN and non‐UN actors' policy documents and handbooks as needing more attention.

Second, this brief review highlights that in cases where experimental or quasi‐experimental scientific evaluation methods were employed to demonstrate attribution, these were very rarely undertaken in the area of rule of law and security institution activities: these approaches were mostly used in sectors such as rural development, education and health. This is reflected in a USAID meta‐evaluation noting that other approaches were often used in the area of peace and security. The explanation put forward was that 'education, health, and economic sectors rely to a greater extent on quantitative data than do evaluations in other sectors, and are therefore more inclined to statistical/quasi‐experimental designs'.<sup>234</sup> A similar finding is evident from the overview of the approach promoted by the OECD. In terms of evaluations in the area of development, it advocates the use of scientific‐experimental impact evaluation. In the area of peacebuilding, however, the guidance recognises the utility of mixed approaches depending on the context at hand.<sup>235</sup>

Finally, in the few documented cases of scientific‐experimental approaches being used in the area of peacekeeping, notable challenges were identified. For example, one study on the impact of UNOCI on the population pointed to the challenge of there being 'no catch‐all way to separate the effects of an intervention, on the one hand, from effects of the factors of an area that prompted the intervention there in the first place'.<sup>236</sup> As the assignment of peacekeeping presences to particular areas is based on necessity, it is difficult to construct a counterfactual comparison.<sup>237</sup> Despite these constraints, the authors of these impact evaluations nonetheless noted the value of such scientific‐experimental approaches in measuring the impact of peacekeeping missions. However, even though these evaluations considered an area closely related to support for rule of law and security institutions, they still did not focus specifically on such support and as a result are of limited relevance in understanding the challenges of measuring this kind of impact specifically. For example, the impact evaluation on UNOCI examined how the presence of the peacekeeping mission has given citizens a greater sense of security rather than how the mission's support to institution building has affected the well‐being of the population, while an evaluation report on the impact of UNMIL examined whether 'proximity to deployments is associated with more or less crime'.<sup>238</sup> Hence, while the evaluations were able to point to whether the presence of a peacekeeping mission had a direct impact on its intended beneficiaries, they did not examine the value of specific activities. Moreover, in line with the limitations posed by such methods, the evaluations were restricted in their ability to substantially explain why specific impacts were achieved or not.

This section has shown that although most international actors recognise the importance of measuring impact, there is no agreement on the most effective approach to doing so. While scientific‐experimental approaches are often promoted in policy guidance as being more 'rigorous', in practice, other approaches to measuring impact have gained recognition. Moreover, these other approaches (e.g. theory‐based, participatory, utilisation‐focused) are increasingly considered to provide significant advantages when measuring impact in complex environments. This is because they may be better suited to answering the questions international actors are seeking to understand and because they are based on more accessible skill‐sets and more modest resources.

The next section will take this analysis further by highlighting how impact can be measured in practice in peacebuilding contexts. In particular, it will examine questions such as the level at which impact should be measured, the human and financial resource constraints that affect how impact can be measured, the use of indicators in measuring impact, and engagement among international and national actors when measuring impact.

## **MEASURING IMPACT: KEY ISSUES FOR PEACEBUILDING SUPPORT**

There is no single best practice for measuring the impact of peacebuilding interventions on rule of law and security institutions in host countries. Rather, it is important that actors and institutions adapt methodologies based on their own requirements and specificities. In doing so there is a need to understand the lessons and good practices that have already been developed by the international community. Moreover, the particularities of peacebuilding contexts that can influence how impact can be measured need to be taken into account. This section takes the analysis set out in the previous sections further, reflecting on the key issues that international actors often grapple with when considering impact measurement. The discussion looks at several key questions from a peacebuilding perspective in order to provide the building blocks of an individualised approach to measuring impact.

The key questions that focus the discussion are:

1. What to measure and when?


## **What to measure and when?**

There is often a lack of clarity within the literature on what should be measured and when. This relates to basic theoretical disagreements on the level of the results chain at which impact is located. Similarly, the fact that many international actors need to measure support efforts on a yearly basis has led to an assumption that impact should also be measured on this basis. This sub‐section attempts to clarify some of these concerns.

## *What to measure?*

There is often confusion among international actors between measuring outcomes and impact.<sup>239</sup> Impact should be measured in accordance with the definition outlined in the OECD DAC guidance. This means focussing on impact at the level of goals while outcome is located at the lower level of the objectives of an intervention. Impact is widely considered more difficult to measure than outcomes because it is harder to show the connection between the intervention and the effect. That is why in practice a majority of evaluations are focused at lower levels of the results chain. If the aim is to measure impact, then outcomes will nonetheless be measured, but efforts have to be made to go beyond measuring outcomes in order to attempt to demonstrate how an outcome has led to an impact.

A certain realism is required as to what it is possible to measure and how this measurement squares with the potential impact a project can achieve: that is to say, it would be unfair to expect one small project in the area of rule of law and security institutions to have an impact at the level of enhanced security and justice for all. This also relates to the question of the level at which impact is being measured, whether the institutional level, the beneficiary level, or some other level. For the development community, impact is primarily focused at the beneficiary level.<sup>240</sup> Tracing impact at this level is still likely to involve measuring impact at the institutional level, but would have the added value of looking beyond that. This approach is reflected in the UNDP's evaluation handbook, which notes that an impact evaluation 'includes the full range of impacts at all levels of the results chain, including ripple effects on families, households and communities; on institutional, technical or social systems; and on the environment'.<sup>241</sup> Therefore, rather than being limited to evaluating direct effects on intended beneficiaries, it would look at effects on a second layer of beneficiaries. A similar understanding is reflected in the area of peacebuilding: the OECD DAC proposes a definition of impact for peacebuilding contexts which reads as 'the results or effects of any conflict prevention and peacebuilding intervention that lie beyond its immediate programme activities or sphere and constitute broader changes related to the conflict'.<sup>242</sup> This definition would meet needs in the area of rule of law and security institutions, which entails that impact should be measured in terms of behaviour and attitudinal change as well as institutional change.<sup>243</sup>

Another related issue is that of who should define the impact to be measured. That is to say, there are arguments that national actors should define the impact they are striving to achieve, while in practice it is often international actors or donors that set the tone of the evaluation. The question is relevant because the impact that an international actor sets out to achieve may not correspond to stakeholders' expectations.<sup>244</sup> Identifying criteria for success in collaboration with national stakeholders would have the advantage of enhancing the relevance and context‐specificity of programmes. This could be supported through participatory or action‐oriented approaches determining what is to be measured on a case‐by‐case basis, that is, essentially collectively defining the criteria for success. This requires, however, flexibility in being able to accept potential deviation from the originally intended goals.

## *When should impact be measured?*

There is a tendency for impact to be perceived as so long‐term that there is a need to wait years to measure it. However, if one takes the definition of impact to include both longer‐term and shorter‐term impact, then evaluations at the impact level can also take place relatively soon after and even within the lifespan of an intervention. In practice, many international actors must report on their progress towards achieving expected accomplishments on a yearly basis. However, this should not be confused with the need to measure impact. Impact assessments do not need to take place on a yearly basis, as they are complex undertakings that may require some time to have passed in order to show visible effects. Moreover, even within the same intervention, there may be certain components that lead to long‐term impact while others lead to short‐term impact. These issues raise the question of when and with what frequency international actors should be attempting to measure impact.

Options for when to measure impact include starting points both before and after the end of an intervention. Measuring impact ex‐post would entail the evaluation taking place at the end of the intervention or even several years later. This is less useful if the purpose of measuring impact is to learn and especially to allow adjustments in programming during the intervention.

An alternative option is therefore to measure impact during the intervention. This can include several scenarios. The first would be to measure impact at the terminal phase (e.g. in the end phase of an intervention but before completion). This would have the advantage of giving more time for impacts to occur while still allowing for minimal adjustments. Furthermore, the lessons identified can be fed into a final evaluation report that will improve future policy and planning. The second option would be to measure impact within a broader evaluation on a yearly basis, but only including impact as one component to be measured alongside other OECD DAC criteria. However, based on the overview of experiences of international actors, this risks watering down efforts to specifically focus on impact, as in these kinds of approaches specific methodologies for measuring impact are rarely used. In fact, impact is often addressed as an 'add‐on' in these evaluations. A third option would be to measure impact on a yearly basis, but restricting the assessment to smaller, more limited elements, which cumulatively support a final more general impact assessment: for example, in the case of contribution analysis this could involve selecting a few strands of the theory of change to test per year. Finally, a last option would be to conduct impact assessments on an ad hoc basis, for instance every three to five years depending on the needs of the intervention. This latter approach would provide the most flexibility for international actors, while at the same time ensuring that impact is being measured when there is really a need. This may best satisfy cost‐efficiency concerns; however, it also means giving up the potential advantages of strategically planning for evaluation as an integral part of an intervention and could lead to poorly planned and hastily executed assessments.

Ideally, there should be clear criteria in place on when impact should be measured. Guidance on impact evaluation suggests assessments at the impact level should be undertaken when: the intervention has taken place for enough time to show visible effects; the scale of the intervention in terms of numbers and cost is sufficient to justify a detailed evaluation; and/or the evaluation can contribute to 'new knowledge' on what works and what does not work.<sup>245</sup>

## **How to measure impact in practice?**

While international actors have used a variety of approaches and methodologies for measuring impact, there is no general agreement on how best to broach this question in practice. Common questions include whether to choose an approach that enables attribution or contribution, which approach or methodology to select, and how to factor in time, human and financial resource considerations. This sub‐section looks at these questions in detail in order to highlight the advantages and disadvantages of the different approaches from a peacebuilding perspective.

## *Attribution or contribution?*

When it comes to measuring impact, there is much debate over the relative value of measuring attribution versus evaluating contribution. There is no right or wrong answer; essentially, the choice should depend on a range of factors, including the purpose of the evaluation, cost‐effectiveness, and peacebuilding concerns, each of which is discussed below.


experimental approach that shows attribution may be useful. However, if the purpose is to answer evaluation questions aimed at learning (e.g. What went right/wrong? Why did this intervention work or not work?), then methodologies supporting contribution would be more suitable (except in the case of theory‐based impact evaluation which can answer both sets of questions).


Table 7 recapitulates some of the advantages and disadvantages of attribution versus contribution as discussed in previous sections:



## **Table 7: Advantages and Disadvantages of Attribution versus Contribution**

## *Which methodological approaches should be adopted?*

The choice of methodological approach is dictated by the decision to establish attribution or to demonstrate contribution (see discussion above). If attribution is to be established, the scientific‐experimental approach must be adopted, and can be combined to a certain extent with other approaches such as theory‐based evaluation. If contribution is to be demonstrated, options include combining elements of participatory, theory‐based or results‐based approaches among others.

Combining participatory and theory‐based approaches may be the most appropriate approach for adapting to the complex environments typical of peacebuilding. Several of the methodologies identified in section 2 combine participatory and theory‐based approaches (e.g. outcome mapping, contribution analysis). In terms of the scientific‐experimental approach, there is an emerging consensus among the evaluation community that experimental and quasi‐experimental methods 'are useful only for discrete and relatively simple interventions geared to precise and measurable objectives and characterized by well‐defined "treatments" that remain constant throughout program implementation'.<sup>247</sup> A further option is to use these methods to answer carefully limited and well defined questions within a broader qualitative evaluation.

When thinking about approaches to use it is helpful to consider those currently being promoted by other international actors. Compatibility of approaches is useful in pursuing joint assessments. A review of international actors' policy and practice suggests that the approaches that are gaining increasing recognition are participatory and utilisation‐focused evaluation approaches. This is reflected in the major consultation undertaken in the humanitarian community on how it should approach joint impact measurement. Clear preference was given to participatory approaches as well as potentially goal‐free evaluation, due to the participatory nature of these methods and their ability to capture unintended and indirect results. Scientific‐experimental approaches were deemed to be too logistically and ethically challenging. Utilisation‐focused approaches have also drawn increased attention due to their ability to enhance the ownership of the evaluation process by the project team, and their flexibility in adapting to diverse evaluation needs.

Table 8 recapitulates some of the advantages and disadvantages addressed in previous sections:


## **Table 8: Advantages and Disadvantages of Different Methodological Approaches**


## *Which methodologies/methods could be considered?*

There is no common agreement among international actors on the best approach to measuring impact. This is because there is no single right way to measure impact. In fact, it may be necessary to combine several methodologies and their associated techniques in order to overcome weaknesses and build on the strengths of individual methodologies. Once there is an agreement on the broad approach, specific methodologies can be considered.

Issues to consider when selecting a methodology include the challenges of conducting evaluations in peacebuilding environments. Significant material has been written on these challenges, so this paper does not dwell on them,<sup>248</sup> but rather sets out how some of the challenges may affect the selection of methodology.

 *Difficulty in conducting data collection.* Common challenges include security conditions hampering the ability to travel to certain sites in order to undertake data collection, baseline information not being available and data being hard to find. This speaks against methods relying on baselines (e.g. impact evaluation using experimental design) and speaks for techniques that do not necessarily require a baseline (e.g. MSC technique or non‐experimental designs).


general theory of change for a sector in the country in question. Contribution analysis is particularly well suited for such complex interventions. Individual strands can be tested at different points in time to enable an iterative contribution analysis.<sup>249</sup> For instance, feedback could be provided on a yearly basis on some of the individual strands comprising the global theory of change.<sup>250</sup> This would enable rapid adjustments where needed on the individual strands while feeding into a comprehensive impact assessment that brings all the strands together. It would also allow for differential timing of individual evaluations in the recognition that some elements of the theory of change need more time to produce effects than others.

There are also characteristics of rule of law and security institutions that make some methodologies more suitable than others.



examined, often beyond the intervention itself: for example, activities in the area of DDR may affect the way that support to SSR is provided and its results. Contribution analysis may be the most appropriate approach for dealing with this type of challenge, as it enables the examination of multiple theories of change with different contribution strands that can be tested at different moments of the lifespan of the intervention.

 *Strengthening national ownership and capacity building*. These are core principles of most rule of law and security institutions components: in the area of SSR, for instance, it has been recognised that there is 'an urgent need to broaden the range of voices heard in most evaluations so that they more accurately represent the views of the populations that are meant to be the ultimate beneficiaries of SSR programmes.' <sup>253</sup> Participatory methodologies such as contribution analysis, MSC and outcome mapping would therefore be useful in these contexts.

This discussion has highlighted that each methodology has its strengths and weaknesses for measuring certain elements of support to rule of law and security institutions, and they can be combined to create a more complete picture of impact (see table 9 below on the advantages and disadvantages of the various methodologies). For example, ROA is suited to looking at projects that involve support to policy changes. Outcome mapping is better suited for those projects that focus on enabling behavioural change and where outcomes may be unpredictable. When measuring the impact on a sector such as rule of law and security institutions, not every project can be covered. Rather, certain projects/programmes are likely to be selected for assessment to provide a snapshot of the bigger picture. The selection of interventions to be assessed is likely to depend on whether the cost of the project merits special examination, whether the project is likely to generate important lessons learned, or whether the contribution to impact is questionable.<sup>254</sup> On this basis, different methodologies may be best thought of as a set of tools to be drawn upon for measuring impact according to their appropriateness for the interventions selected for assessment.


#### **Table 9: Advantages and Disadvantages of Impact Assessment Methodologies**



The selection of methodology will depend on the priorities of the international actor (focus on accountability versus learning, cost versus benefits, etc.). Moreover, the ideal approach would be to triangulate results by using several methods, and indeed there are commonalities among several of the methodologies which facilitate triangulation. Some commonalities include: the use of baselines, theory of change, indicators or progress markers, change stories, workshops, and qualitative data collection methods (e.g. focus groups, interviews, observation). There are also common principles which should be used to guide impact assessments: for example, inclusiveness, testing of the underlying theory of change, mixed‐methods approaches using both qualitative and quantitative methods, and attention to unexpected impacts.<sup>256</sup>

## *How to factor in time, human and financial resource considerations?*

Several evaluations point to the prohibitive costs of measuring impact – primarily due to the challenges of data collection for impact as opposed to performance.<sup>257</sup> This section provides an overview of some of the human and financial resource considerations (table 10) and then highlights how methodologies can be made less expensive and time‐consuming.

## **Table 10: Time/Cost/Skills**<sup>258</sup>




In practice, several of the methodologies can be adapted so that they are less time‐intensive or cost‐intensive: for example, while impact evaluation based on genuine experimental design is promoted as the most rigorous measurement of impact, the situations where it can feasibly be used are limited. In fact, a 2006 study estimates that randomised control trials have only been used in 1‐2 per cent of impact evaluations, that fewer than approximately 25 per cent use baseline surveys, and that robust quasi‐experimental designs are adopted in less than 10 per cent.<sup>265</sup> There have therefore been broad attempts to make impact evaluation more adaptable to "real world" conditions, with options of techniques identified that can reduce time and costs.<sup>266</sup> One technique that could be relevant for international actors is *post‐intervention project and comparison groups with no baseline data*. This design uses the comparison group as the counterfactual, based on the assumption that any differences between the project group and the comparison group after the intervention are due to the effects of the intervention.<sup>267</sup> By eliminating baseline data collection, such evaluations cost significantly less than comparable studies using experimental or quasi‐experimental designs.<sup>268</sup> However, the disadvantage with such simplified evaluation designs is that the results will be less reliable and robust from a methodological and statistical perspective.<sup>269</sup>
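To illustrate the arithmetic behind a post‐intervention design with project and comparison groups and no baseline data, the following is a minimal sketch in Python. It is not drawn from the studies cited; the function name, the survey scores and the 1 to 5 perceived‐safety scale are hypothetical and chosen purely for illustration. The sketch estimates impact as the difference in mean scores between the two groups after the intervention, together with a standard error for that difference.

```python
from math import sqrt
from statistics import mean, stdev


def post_only_impact_estimate(project_scores, comparison_scores):
    """Estimate impact as the post-intervention difference in group means.

    Assumes (as the design itself does) that the two groups were similar
    before the intervention, since no baseline data are available to check.
    """
    difference = mean(project_scores) - mean(comparison_scores)
    # Standard error of a difference in means, allowing unequal variances.
    standard_error = sqrt(
        stdev(project_scores) ** 2 / len(project_scores)
        + stdev(comparison_scores) ** 2 / len(comparison_scores)
    )
    return difference, standard_error


# Hypothetical perceived-safety scores (1 = very unsafe, 5 = very safe),
# collected after the intervention only.
project = [4, 5, 3, 4, 4, 5]
comparison = [3, 3, 2, 4, 3, 3]

estimate, std_error = post_only_impact_estimate(project, comparison)
print(f"Estimated impact: {estimate:.2f} (standard error {std_error:.2f})")
```

Because nothing in this calculation can verify that the groups were comparable beforehand, any pre‐existing difference between them is counted as impact, which is one reason results from such simplified designs are considered less robust.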


The participatory methodologies that enable capacity building of national stakeholders (e.g. outcome mapping and MSC) can also be adapted to be less time‐consuming by relying on external evaluation approaches as opposed to highly participatory ones. For example, the approach taken by MSC of asking beneficiaries several questions regarding the MSC can be adapted for use by external evaluators in their interviews or focus group meetings, without requiring training to be provided to beneficiaries to collect data in an iterative manner. The same goes for outcome mapping, which can be adapted to be used at the end of an intervention by translating the intervention documents into behavioural terms to support the identification of boundary partners to be interviewed and outcome challenges to be tested.<sup>270</sup> In the case of a Sida evaluation, for example, external evaluators adapted the methodology to use progress markers in their surveys/questionnaires rather than relying on the generation of data by stakeholders.<sup>271</sup>

While this may not be the way the methodologies were intended to be used, there are cases where it is necessary to be pragmatic about what is feasible, and adapting methodologies can still strengthen traditional evaluation approaches to focus more on impact.

## **What role for indicators in measuring impact?**

Indicators are at the heart of the monitoring and evaluation approaches commonly used by international actors, in particular as part of the results‐based management framework. Against this background, some examination is required of how indicators can be used at the impact level. While indicators should be perceived as part of a methodology for measuring impact, this question is addressed separately due to the importance often granted to indicators in M&E systems. Nevertheless, it must be emphasised that indicators are not adequate tools for measuring impact on their own. They can support efforts to understand impact, but they cannot replace a robust methodology for measuring impact.

## *How can indicators measure impact?*

Impact indicators are part of an organisation's performance measurement system.<sup>272</sup> They are located at the final goal level, and aim to measure highest‐level changes in the results chain. Indicators at the impact level have been used by several organisations. The usage of the term 'impact indicators' differs among them: for example, the UNDP lists three levels of indicators: impact, outcome and output.<sup>273</sup> In contrast, EuropeAid lists three different levels of impact indicators (in addition to regular output and outcome indicators): specific impact, intermediate impact and global impact.<sup>274</sup> Specific and intermediate impacts both cover the OECD standard definition of positive and negative, primary and secondary, direct or indirect, intended or unintended long‐term effects produced by a development intervention, with the slight difference that intermediate impacts are longer term in nature and the last cause‐and‐effect chain level that can be monitored effectively and at the same time demonstrate sufficient attribution.<sup>275</sup> For the health sector, for example, impact indicators would include data on child mortality, life expectancy, etc. However, EuropeAid does not develop indicators for the global impact level, which corresponds to the highest‐order goals, such as poverty reduction, as it considers that attribution and reasonable cause‐effect linkages are impossible to establish at this level.<sup>276</sup>

Finally, a third example can be drawn from the field of DDR. The IDDRS Module 3.50 on Monitoring and Evaluating DDR Programmes classifies indicators another way.<sup>277</sup> The two broad categories of indicators are performance and impact. A third category is proxy indicators, which can also be related to impact. Accordingly, *performance indicators* seek to measure outputs and outcomes.<sup>278</sup> *Impact indicators* measure the 'overall changes in the environment that DDR aims to influence'. It is noted that 'impact indicators often use a composite set (or group) of indicators, each of which provides information on the size, sustainability and consequences of a change brought about by a DDR intervention. Such indicators can include both quantitative variables (e.g., change in homicide levels or incidence of violence) or qualitative variables (e.g., behavioural change among reintegrated ex‐combatants, social cohesion, etc.).'<sup>279</sup> *Proxy indicators* can be used in situations where data for impact measurement cannot be reliably collected, or are completely missing. They 'are variables that substitute for others that are difficult to measure directly [and] may reveal performance trends and make managers aware of potential problems or areas of success'.<sup>280</sup>

UNDP/BCPR also discusses proxy indicators in its 'How To Guide' for DDR, and notes that indirect or proxy indicators are typically used in cases when 'a result is an abstract concept. For example, a DDR programme measures impact ("security situation improved") by using four proxy indicators (violence, confiscated ammunition, confiscated weapons, suspects detained). Indirect indicators are also used if data for direct indicators is not collected frequently enough (e.g. household surveys are often conducted only every five years), or if data for direct indicators is too difficult, dangerous or expensive to collect.'<sup>281</sup>
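As a purely illustrative sketch of how a set of proxy indicators such as the four mentioned above might be tracked against an abstract result, the Python fragment below computes the percentage change of each indicator from a reference period and flags whether it moved in the hoped‐for direction. None of this comes from the UNDP/BCPR guide: the function names, the figures and, in particular, the direction assigned to each indicator are assumptions that would need to be agreed for each context.

```python
def percent_change(baseline, current):
    """Percentage change from a reference-period value to the current value."""
    if baseline == 0:
        raise ValueError("reference value must be non-zero")
    return 100.0 * (current - baseline) / baseline


def summarise_proxies(baseline, current, hoped_direction):
    """Map each proxy indicator to (percentage change, moved as hoped?).

    hoped_direction holds +1 where an increase is read as progress and -1
    where a decrease is read as progress. This is a judgement call: a rise
    in confiscated weapons, for example, could signal better enforcement
    or simply more weapons in circulation.
    """
    summary = {}
    for name, direction in hoped_direction.items():
        change = percent_change(baseline[name], current[name])
        summary[name] = (change, change * direction > 0)
    return summary


# Hypothetical counts for a reference period and a later period.
baseline = {"violent_incidents": 200, "confiscated_ammunition": 1000,
            "confiscated_weapons": 50, "suspects_detained": 80}
current = {"violent_incidents": 150, "confiscated_ammunition": 1200,
           "confiscated_weapons": 65, "suspects_detained": 80}
# Assumed directions, for illustration only.
hoped_direction = {"violent_incidents": -1, "confiscated_ammunition": 1,
                   "confiscated_weapons": 1, "suspects_detained": 1}

summary = summarise_proxies(baseline, current, hoped_direction)
for name, (change, as_hoped) in summary.items():
    print(f"{name}: {change:+.1f}% (moved as hoped: {as_hoped})")
```

Such a summary can only suggest a trend. It does not establish that the intervention caused the change, which is why indicators need to sit within a broader methodology for measuring impact.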

In sum, while there are challenges to the development of impact indicators, experience shows that these challenges can be overcome in a number of satisfactory ways. A solution adopted by several actors is to divide impact indicators into those that can be realistically measured and those that would be more challenging. For example, one option would be to develop a list of intermediate indicators, which can be feasibly measured, and a list of longer‐term indicators, which might not be measurable in the immediate term but can serve to track progress over a longer time span. Another interesting approach is that of developing proxy indicators as an additional category to support the capture of some of the more intangible elements of impact at the highest level. Progress on the intermediate and proxy indicators would therefore suggest progress in reaching the higher level of impact.

## *What are the considerations when developing impact indicators in the area of peacebuilding support to rule of law and security institutions?*

In peacebuilding contexts, many practitioners feel that it is extremely difficult – if not impossible – to define indicators at the impact level.<sup>282</sup> Challenges in this task can be linked to two main themes: the difficulty of measuring results at such a high level, and the fact that the indicators are weakly designed to measure impact from the planning stage.

First, the issue of measuring the intangible is amplified in the area of rule of law and security institutions because issues such as 'peace, communal trust, and good governance are intangibles.'<sup>283</sup> It is therefore considered that peacebuilding interventions cannot be compared to development interventions in the field of health or education. This is particularly the case in the area of rule of law and security institutions where broad goals are often affected by a plethora of factors, for example at the political and social levels. Furthermore, indicators at the impact level are often of only limited usefulness because an intervention's contribution to impact is likely to be very small and abstract: 'The project might, for example, contribute to an increase in a democracy indicator by three points, but what does this really tell the project leadership?'<sup>284</sup>

Second, there is a sense among practitioners that even in circumstances where indicators at the impact level are identified, they are not really adequate representations of the impact level. For example, 'number of soldiers demobilised' has been used as an impact indicator but it is not able to demonstrate improvement to beneficiaries' lives. It could more adequately be considered an output indicator.<sup>285</sup> This is linked to a broader challenge with the use of indicators in general in so far as there are often 'design flaws'.<sup>286</sup> Such flaws have been reflected in many reports of evaluations: for example, UNIFEM's 2009 meta‐evaluation notes that 'the quality of the indicators in the logical frameworks varies considerably' and in some cases 'the indicators are too specific so that regular reports contain frequent repetitions of activities having been achieved. In others the indicators contain broad, unqualified statements.'<sup>287</sup> This problem is also mentioned in individual evaluations, which have recognised that indicators are often not tracked in practice and the right indicators are not always selected.<sup>288</sup> Challenges in developing appropriate indicators and using them in a regular and effective way are linked to broader difficulties in using the results‐based management framework in peacebuilding contexts.

In the area of measuring performance of law and justice institutions specifically, the World Bank has advocated for the use of a "mix of indicators".<sup>289</sup> This appears to be even more important when attempting to measure impact, as elements of impact may be difficult to capture and therefore must rely on a mix of indicators to adequately track progress. Impact indicators need to be specifically developed at the design stage to ensure that they enable measurement at the highest level of the results chain and are not relegated to the output or outcome levels.

Finally, and as discussed above, an essential consideration in the area of peacebuilding support to rule of law and security institutions is the need to be able to capture behavioural change. This is an important component of effects at the impact level and indicators are often challenged in this respect. The progress markers discussed in the context of the outcome mapping methodology provide an interesting example as they could be used to supplement regular indicators that are not as capable of assessing behavioural change. Progress markers are not as rigorous as regular indicators in that they are not intended to follow all of the recognized criteria promoting specific, measurable, achievable, relevant and time‐bound (SMART) indicators. However, they can capture changes in behaviour and relationships that are difficult to perceive with the use of more traditional SMART indicators. In particular, they enable telling the 'story of change' as opposed to providing 'one‐off snapshots of change'.<sup>290</sup>

## **Who should be engaged, how and why?**

While effective and sustainable peacebuilding efforts rely on international actors coming together to provide support in a coherent manner, their roles and responsibilities in this area are highly fragmented. In an ideal world, a sector‐wide approach to measuring impact would be promoted to help address this challenge, as well as to measure the extent to which the support followed an integrated approach. Moreover, national actors would be engaged in order to support national ownership and capacity‐building in the area of M&E. This sub‐section will examine the question of who should be involved in measuring impact within the sector, what type of engagement should be promoted among international actors, and how engagement with national actors can be promoted when measuring impact.

## *Who should be involved in measuring impact within the sector?*

Evaluating support to rule of law and security institutions can be done in a sector‐wide or component‐based approach. Sector‐wide means that the whole sector is examined, while a component‐based approach would focus on individual evaluations, for example, for mine action, rule of law, DDR etc. The rationale for sector‐wide evaluations can be linked to the challenges of attributing impacts to any single project or programme, given the number of actors operating in post‐conflict environments. Focusing on these higher levels therefore allows examination of the contribution of different actors to the impact observed on national institutions and beneficiaries. Moreover, sector‐wide evaluation at the impact level is important in seeing the bigger picture and understanding how particular components have complemented or undermined one another. For example, it has been noted that in the area of DDR, evaluations would benefit from being undertaken across a number of related sectors.<sup>291</sup> This is because DDR requires achievements to have taken place in other sectors (e.g. SSR) for it to be sustainable. Similarly, the European Commission has since 2001 centred its conflict prevention and peacebuilding strategy on an integrated approach encompassing 'different types of activity', 'different time dimensions', 'the activities of different actors' and 'different geographical dimensions'.<sup>292</sup> It found that while the impact of its activities had not yet fully materialised and therefore could not be assessed, the integrated nature of its approach was crucial to the positive contribution it made to mitigating the root causes of conflict in three different countries.<sup>293</sup>

Sector‐wide approaches can sometimes be broken down into component parts. That is to say, for example, if contribution analysis was used to measure the impact at the sector level, multiple theories of change would have to be examined, each of which belongs to individual components. The final assessment would then look at how the individual strands relate to one another in order to form the 'big picture' on the sector‐wide level. In the area of SSR, it has been noted that most M&E is conducted at the project or programme level – often because the project level is considered to provide a more feasible unit of analysis due to the availability of specific project documents and related logframes. <sup>294</sup> However, even at this component‐level, there is a growing consensus 'on the need to move M&E beyond the project level to the sector and strategic level in fragile states'.<sup>295</sup>

The selection of approach has different implications. Sector‐wide evaluations often rely on joint evaluations (see discussion below on international actors). Moreover, a sector‐wide approach would imply the need for clear terms of reference and a joint understanding of criteria of success etc. This is because there needs to be a coordination mechanism at the strategic level. A number of obstacles remain to developing such an approach, including bureaucratic disincentives, different institutional cultures and timelines, and the reality that donors may prefer to account for the impact of each actor individually in order to enhance accountability over resources invested. There are nonetheless opportunities to enhance the coherence of evaluation efforts in the sector, and given the significant human and financial investments involved it is important not to reinvent the wheel but to look for synergies and avoid duplication to the greatest extent possible. Actors may consider building on approaches that have been recognised as useful by other international actors, such as participatory and utilisation‐based approaches. Simple efforts such as informing other actors of an intention to undertake a perception survey and allowing them to contribute to it or subsequently use it can reduce costs and ensure that actors are working on the basis of the same information.

## *What type of engagement should be envisaged among international actors?*

While evaluations are often conducted by individual actors, the value of conducting joint evaluations is also gaining increasing recognition, particularly for establishing impact in areas where multiple actors are engaged. When focusing on sector‐wide approaches, the value of joint evaluations is further enhanced due to the challenges of differentiating between different entities' impacts in conflict prevention and peacebuilding contexts.<sup>296</sup> Therefore, joint evaluations have been said to 'provide the best way of assessing the cumulative and overall impacts of international programming in a single conflict context'.<sup>297</sup> A further advantage of joint evaluations is the ability to pool resources. For instance, to address budgetary constraints, UNICEF's meta‐evaluation notes that where there is common interest, a good approach may be 'to institute a series of joint evaluations of intervention strategies in key areas with one or more partners among the international agencies'.<sup>298</sup>

While there are advantages to such approaches, there are also challenges. First, if an impact assessment is conducted with the sole purpose of proving attribution to an individual actor in an effort to justify resources, there is less likely to be room for joint evaluations that look at cumulative impact. This is less of a challenge for evaluations that seek to show plausible contribution. However, in cases where the joint approach is favoured, there can still be challenges when different theories of change are used by different entities, when planning and programming timelines do not correspond, or when methodologies used are not compatible with one another (e.g. a goal‐free approach would not necessarily be compatible with a theory‐based approach). Finally, in the case of joint evaluations, there is also a greater need to make special efforts to strengthen local ownership given the overburdening nature of joint efforts.<sup>299</sup>

Compatibility of approach must be considered and embracing the OECD DAC definition of impact goes a long way towards promoting compatibility given that a majority of international actors use the OECD DAC's understanding of impact (based on intended, unintended, positive and negative consequences). Further issues to consider are other entities' need to follow certain rules and regulations. Ultimately utilisation‐focused approaches for joint evaluations may leave the greatest leeway for establishing compatibility, as the approach selected depends on the specific purpose of the evaluation.

## *What type of engagement should international actors envisage with national actors?*

International actors have increasingly recognised the need to engage with national actors when conducting evaluations. For example, a General Assembly resolution of 2004 'recognizes that national governments have primary responsibility for coordinating external assistance, including that from the United Nations system, and evaluating its impact in contributing to national priorities'.<sup>300</sup> This imperative must be contextualised in the growing trend towards supporting national ownership of evaluations through participatory approaches, as well as increasing joint and partner‐led evaluations. For example, the OECD DAC's quality standards for development evaluation note the importance of partnership, coordination and alignment with national actors, as well as supporting capacity development.<sup>301</sup>

In evaluations of support to rule of law and security institutions, national actors can take on many different roles. A minimal approach would be to ensure that key stakeholders are engaged throughout the evaluation process: an example would be consulting with national authorities when developing the terms of reference for evaluation, including securing agreement on goals and theory of change. This could be extended by reaching out to those stakeholders that may be affected by the evaluation to make them aware of the evaluation and support their willingness to engage with the process. Ultimately, efforts should be made to ensure that the evaluation process recognises national and local evaluation plans and considers potential synergies.<sup>302</sup>

Other approaches with a higher degree of participation include involving national evaluators directly in the evaluation team itself, which could mean ensuring that there are national experts on a team, or actively facilitating national actors to collect and monitor data (e.g. in the case of outcome mapping or MSC). The rationale for including national actors in such teams is simple: they have greater knowledge of local culture and politics, including stakeholders and how to engage with them, and often possess the necessary language skills. Their inclusion also supports efforts to build the evaluation capacity of the country. Furthermore, it would reduce the common perception that evaluations are 'donor‐centric', aiming solely to accommodate the needs of international actors. If evaluation results are not produced with support from national partners, the actual use of the evaluation findings will often be very limited.<sup>303</sup>

However, participatory evaluation approaches pose risks and challenges, especially in conflict and post‐conflict settings. A significant challenge is the potential for biased and distorted findings when beneficiaries, partners and stakeholders have been involved in an armed conflict.<sup>304</sup> The culture of secrecy in rule of law and security institutions in host countries can also be a major obstacle for the collection of data and use of evaluation results.<sup>305</sup> The lack of sufficient evaluation expertise on the national level is another challenge often encountered<sup>306</sup> and one that highlights the need for increased focus on national capacity‐building for evaluation. Finally, one of the most significant challenges is finding a way to accommodate the different evaluation needs and perspectives of national partners, international donors and beneficiaries.<sup>307</sup> Action‐orientated, goal‐setting approaches and utilisation‐focused evaluation can prove useful in aligning all stakeholders behind common goals and objectives.

The exact degree of participation and ownership should be carefully calibrated to the specific intervention and context. The OECD Guidance on Peacebuilding Evaluations suggests some questions to consider when deciding on the extent to which local beneficiaries, stakeholders and partner governments should be included in the design and conduct of the evaluation. These include reflecting on the degree of politicisation in the country, the potential for bias in the evaluation of success based on power relations, the feasibility of supporting the collective definition of theory of change, indicators, etc., and the ability to ensure that all relevant stakeholder views can be included in the evaluation in a meaningful manner.<sup>308</sup>

## **Summary**

This section has examined some of the key concerns for international actors seeking to measure impact in peacebuilding contexts. First, it examined the level and frequency at which impact should be measured. It highlighted that impact need not be measured on a yearly basis but should instead be measured when there is a need to learn new information on what works and what does not work, or when there are doubts about the extent to which the intervention is achieving expected results. Second, this section examined the advantages and disadvantages of different evaluation approaches and methodologies from a peacebuilding perspective. In particular, the discussion highlighted that there are significant differences in terms of resources and time required to conduct an evaluation depending on the methodology selected. It was nonetheless also underlined that there are ways to adapt methodologies to suit the demands of complex contexts and limited resources, thereby making them more accessible for international actors. Third, this section reflected on the role of indicators in measuring impact. In particular, it was noted that while there are numerous challenges to establishing appropriate indicators at the impact level, these challenges have been overcome in several instances by various international actors. One approach is to divide indicators at the impact level into those that can realistically be measured and those that would be more difficult to use. Finally, this section considered who should be engaged in measuring impact, how and why. It was noted that there are increasing calls to promote sector‐wide approaches to measuring impact, which would require joint evaluations. There is also an increasing recognition of the need to engage national actors when measuring impact in order to support national ownership and capacity‐building.

In sum, this section highlights that while there are genuine challenges to measuring impact in peacebuilding contexts, it can be done. Successful impact measurement begins with an understanding of the good practices that have already been developed as well as recognition of the fact that any approach should be based on the individual needs and specificities of each actor and intervention.

## **CONCLUSION**

Understanding impact is necessary and desirable for international actors engaged in peacebuilding interventions. Support to rule of law and security institutions is a particularly important part of this peacebuilding agenda, but although closely linked these two activities are often promoted in a disconnected manner. Measuring the impact of international support in this area can therefore help to identify synergies, build coherence, and maximise the potential positive impacts they can achieve. Focusing on peacebuilding contexts and interventions, this paper has explored a range of approaches to measuring the impact of support to rule of law and security institutions and how these approaches and methods have been applied in practice. Approaches range from those that can demonstrate attribution to those that can evaluate contribution. The specific impact assessment methodologies identified include impact evaluation, theory‐ based impact evaluation, contribution analysis, outcome mapping, MSC, and ROA.

While acknowledging that much remains to be learnt in this area, the study nonetheless offers some key findings. First, there is no common agreement among international actors on the best approach to measuring impact. This is because there is no 'best' approach. Each approach and methodology has its strengths and weaknesses. In fact, the best way to measure impact is to combine several approaches and methodologies in order to build on their individual strengths and mitigate their weaknesses according to the context at hand. This recognizes that the triangulation of methods offers the most valuable approach to developing strong impact statements. Moreover, this paper argues that the selection of a combination of approaches and methodologies to be applied should be based on four criteria: firstly, the purpose of the evaluation (accountability or learning); secondly, the questions the evaluation seeks to answer (What impact? Why was the impact such?); thirdly, cost‐effectiveness in view of the task at hand; and fourthly, the specific constraints of the peacebuilding context.

A second key finding relates to the scientific‐experimental approach to evaluations, which has often been promoted in the development field as the only 'rigorous' approach to measuring impact. This is because it is based on counterfactual analysis – that is to say an understanding of the situation of the beneficiaries had the intervention not taken place – substantiated by control trials and subject to statistical analysis. While a counterfactual is necessary for measuring impact in the sense of attribution, this paper has also shown that impact can be measured on the basis of other evaluation approaches. Theory‐based approaches (i.e. testing the theory of change) and participatory approaches (looking to beneficiaries in order to hear practical examples of what has changed in their lives) can also provide the basis for establishing a suitable counterfactual. Actors seeking to measure impact can therefore adopt a host of alternative approaches that may be more amenable to the complexities of peacebuilding environments and the goals and resources of the evaluation. This entails being able to use methodologies that might have lower costs or skill requirements than those traditionally promoted as a basis for impact measurement.

A third key finding is the understanding that there are small steps that can be taken to strengthen traditional evaluation approaches to focus more on impact. For instance, if the capacity or will to invest in fully‐fledged impact assessments is lacking, it is possible nonetheless to adapt these methodologies to be less costly and less time consuming. Impact evaluation can be rendered less expensive by using alternative techniques such as post‐intervention project and comparison groups with no baseline data. Participatory methodologies, which are often dismissed as being too time consuming, can also be made more user‐friendly: for instance, in the case of the MSC technique, rather than relying on numerous participatory workshops and storytelling, an external evaluator can insert the 'most significant change' question into interviews or focus group meetings. This also removes the need to provide training to beneficiaries on data collection and analysis. The same goes for outcome mapping: progress markers can be integrated into surveys rather than relying on the generation of data by stakeholders. While this is not the way these methodologies were originally developed for use, such adaptation allows international actors to focus more on impact by supplementing their traditional evaluations with new techniques.

A fourth key finding is that measuring impact can be a significantly political undertaking. There is a risk that an evaluation may shed light on the failings of an intervention to achieve its desired impact, and in that case there needs to be clarity on whether all actors are willing to confront this reality and what they can do with this information. This requires a certain realism in terms of the expectations of both the intervention and the evaluation. Concerning the intervention, it must be clear that a relatively small project cannot be expected to have a huge or even a significant impact. Nor is it likely that it will have an impact at the level of broader peace and security. Clarity and agreement are needed from the outset on what type of impact can be reasonably expected. Concerning the evaluation, it cannot be both rigorous and 'quick and dirty'. That is to say that there is a risk that assessments that have not been conducted with sufficient planning, time and resources may be placed on a pedestal and used to make important decisions simply because they were labelled impact assessments. However, a proper impact assessment requires significant resources and goes beyond a basic evaluation at the output level. Actors must understand what can realistically be expected from the range of available methodologies but also from the resources invested.

Finally, and against this background, attempting to measure impact is – or ought to be – more expensive and time‐consuming than an evaluation at a lower level of the results chain. It should be clear that there is no need to measure impact on a yearly basis. In fact, it has been suggested that impact should be measured when the intervention has been in place for long enough to show observable effects, when the scale of the intervention (numbers and cost) justifies measuring impact, and/or when the evaluation can contribute to new understandings on what works and what does not.<sup>309</sup> The same logic can be applied to what should be measured within a large intervention – not every project needs to undergo an impact assessment, freeing resources to target those areas of the theory of change which are least clear and would benefit most from a more thorough assessment. An informed distribution of resources would help to mitigate a common critique of peacebuilding activities, which is that there is a lack of clarity as to whether an intervention is achieving its expected results due to an inconsistent theory of change or challenges in the way the intervention is being carried out.

In terms of the way ahead, there is a need to recognise that measuring impact provides an opportunity to engage with national actors. There have been increasing calls within the policy community to support national capacity‐building through evaluations. Building capacity for evaluations is fundamental for the professionalism and effectiveness of security and justice institutions, and yet is an oft‐neglected component of such programmes. Using elements of participatory approaches can help to build this national capacity, as well as foster an understanding among national actors of M&E as a normal and useful component of such activities. While supporting national ownership through impact assessments should be encouraged, the extent of this kind of capacity‐building should be decided according to the conditions of each context.

In moving forward there is also a need to promote a 'culture of learning' as opposed to a 'culture of blame' in evaluations.<sup>310</sup> This is likely to resonate strongly in the case of impact assessments, which as discussed above can be highly political and politicised undertakings. International actors should take this challenge into account when developing their approach to measuring impact, and should accompany their attempts to measure impact with an effective communications strategy that highlights the importance of evaluations as a learning mechanism. Establishing a culture of learning also relates to the need to ensure that the results of an evaluation are actually used to support adjustments in policy and practice based on an enhanced understanding of what is working and what is not. Procedures for ensuring this is the case should be included in any approach to measuring impact – including timelines and responsibilities. Moreover, mechanisms for sharing findings with national (and other international) partners should be developed.

This paper has highlighted that measuring impact is essential in the area of peacebuilding support to rule of law and security institutions. Moreover, it has shown that there is a range of feasible approaches and methodologies to support international engagement in this area. Ultimately the approach taken will vary from case to case. Piloting various approaches will lead to a better understanding of what works for different actors operating in quite distinct contexts. There is also a need for applied research on the practical implementation of theoretical and methodological approaches to measuring impact. This includes the compilation of detailed knowledge on evaluations that have used different techniques in the area of support for rule of law and security institutions. Only then can we grasp the quite distinct challenges encountered and achieve a more sophisticated understanding of the methodologies best suited to learning about the impact of peacebuilding interventions on rule of law and security institutions.

## **NOTES**


support to rule of law and security institutions. The need to measure impact is also


Eric Mvukiyehe and Cyrus Samii, *Laying a Foundation for Peace? A Quantitative Impact*


Money', Report 1, (London: ICAI, November 2011), p. 14.

<sup>114</sup> *Ibid.* p. 15.

<sup>115</sup> Roger Drew, *Synthesis Study of DFID's Strategic Evaluations 2005–2010: A report produced for the Independent Commission for Aid Impact* (London: ICAI, 2011), p. 59.



*Maternal and Child Health and Nutrition Outcomes In Bangladesh* (London: DFID,




(FAO), the UK Department For International Development (DFID), the Active Learning Network for Accountability and Performance (ALNAP), the Emergency Capacity Building Project (ECB), DARA International, Humanitarian Accountability Partnership International


*Complex: Attribution, Contribution and Beyond,* (Comparative Policy Evaluation vol. 18),


*Conducting Quality Impact Evaluations Under Budget, Time and Data Constraints,* Independent Evaluation Group (Washington DC: World Bank, 2006); Michael Bamberger, Jim Rugh, Linda Mabry, *Real World Evaluations: Working under Budget, Time, Data and Political Constraints* (Thousand Oaks, CA: Sage Publications, 2006). Michael Bamberger,



<sup>310</sup> Tony Beck, Francoise Coupal, Scott Green and Christina Bierring, *Strengthening Evaluation for Improved Programming ‐ UNFPA Evaluation Quality Assessment* (New York: UNFPA, December 2005), p. xii.

## **Measuring the Impact of Peacebuilding Interventions on Rule of Law and Security Institutions**

## Vincenza Scherrer


Since the 1990s, internationally supported peacebuilding interventions have become increasingly prominent. Activities focusing on rule of law and security institutions are a key component of this agenda. Despite increasing calls for more rigorous analysis of the impact of peacebuilding interventions, conceptual advances have been limited. There is little clarity on what is working, what is not, and why. This SSR Paper seeks to address this gap by mapping relevant approaches and methodologies for measuring impact. It examines how international actors have approached these questions in relation to support to rule of law and security institutions in complex peacebuilding environments. Most significantly, the paper demonstrates that measuring impact is not only feasible but necessary in order to maximise the effectiveness of major international investments in this field.

**Vincenza Scherrer** is Programme Manager of the United Nations and Security Sector Reform programme at the Geneva Centre for the Democratic Control of Armed Forces (DCAF). She has provided support to UN entities in the areas of policy research and guidance development on topics ranging from national security policy making to the nexus between disarmament, demobilization and reintegration and security sector reform. Prior to joining DCAF, Vincenza worked at the UNDP's Bureau for Crisis Prevention and Recovery and at the UK Parliament. Vincenza holds degrees from the Graduate Institute of International Studies in Geneva and the London School of Economics.

published by **DCAF** (Geneva Centre for the Democratic Control of Armed Forces) PO Box 1361 1211 Geneva 1 Switzerland

**www.dcaf.ch**